Re: GIN: any ordering guarantees for the hits returned? - Mailing list pgsql-general

From David Fetter
Subject Re: GIN: any ordering guarantees for the hits returned?
Date
Msg-id 20071118175941.GR28860@fetter.org
Whole thread Raw
In response to GIN: any ordering guarantees for the hits returned?  (adrobj <adrobj@yahoo.com>)
Responses Re: GIN: any ordering guarantees for the hits returned?  (Sam Mason <sam@samason.me.uk>)
Re: GIN: any ordering guarantees for the hits returned?  (Alex Drobychev <adrobj@yahoo.com>)
List pgsql-general
On Fri, Nov 16, 2007 at 07:56:45PM -0800, adrobj wrote:
>
> Hello,
>
> I have a moderately large (~10-20GB) table:
>
> CREATE TABLE msgs (
>   msg varchar(2048),
>   msg_tsv tsvector,
>   posted timestamp
> );
>
> CREATE INDEX msgs_i ON msgs USING gin(msg_tsv);
>
> The table never gets updated (more specifically, it gets re-created
> once a day with no updates in between).
>
> I want to run queries of the following form:
>
> SELECT msg, posted FROM msgs WHERE 'blah blah'::tsquery @@ msg_tsv
> ORDERED BY posted DESC; (with various LIMIT/OFFSET)
>
> Which obviously may get too expensive, for it will cause reading and
> sorting of all rows meeting the condition, i.e. too many disk reads.
>
> On the other hand, (as far as I understand) GIN always produces hits
> already sorted in the insertion order.
>
> So - what if I just populate my table in the order of decreasing
> 'posted', remove the "ORDERED BY" clause and just hope for the best?
> Will the correct ordering be guaranteed?

Ordering is never guaranteed without an ORDER BY, except in the time
between a CLUSTER and the first write operation after it.

> If not, are there any other ideas around?

Rather than assuming you know where problems will arise, do some
profiling and find out where they actually do :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: regexp_replace() function in new version
Next
From: Oliver Kohll
Date:
Subject: Fwd: max_fsm_relations