Optimizing search query with sorting by creation field - Mailing list pgsql-general

From Droid Tools
Subject Optimizing search query with sorting by creation field
Date
Msg-id CABbQ-pUrdq9+Rh8myo=JEze3Ezdq9Tj60knAFxZ59Aqs0WrWGA@mail.gmail.com
Whole thread Raw
Responses Re: Optimizing search query with sorting by creation field
List pgsql-general
Hi, 

I'm looking for tips on optimizing a search query where someone searches for content within a post and wants to sort the results by a timestamp. I have a table `posts` with a `created` field (timestamp) and a post_tsv column (TSVECTOR). Likewise I have a GIN Index on the `post_tsv` field and a separate index on the `created` field. 

My initial, naive, attempt was to simply to do something like: 

```
SELECT * FROM posts WHERE post_tsv @@ websearch_to_tsquery(?) ORDER BY created DESC
```

However, I didn't realize in this case the `created` index would be ignored, which means if there was a large number of posts returned, this query would take several seconds to execute. Also the planner would do weird things even if the result set was small and still take several seconds to execute. Currently I've papered over the problem by issuing a subquery and sorting that instead.

 ```
SELECT * FROM 
  (SELECT * FROM posts WHERE post_tsv @@ websearch_to_tsquery(?) LIMIT 10,000) q
ORDER BY created DESC
```

In short I execute the search, limit that to 10,000 rows, and then order the 10,000 rows that were returned. This worked amazingly for queries that returned fewer than 10,000 rows as those queries went from taking several seconds to run down to a handful of milliseconds. The problem is for queries with more than 10,000 rows you essentially end up with random results. I'm still not using the created index, but sorting 10,000 rows in memory is relatively fast.

I'm stuck where to go from here - what I would like, since I know I will only ever ORDER BY the created field is to build some index where the default ordering is by the created field. GIN, as I understand it, doesn't support indexing in this manner (using one of the columns as a sort field). Is there anything else I could try?

pgsql-general by date:

Previous
From: Adrien Nayrat
Date:
Subject: Re: "invalid contrecord" error on replica
Next
From: Atul Kumar
Date:
Subject: idle_in_transaction_session_timeout