What is the right way to deal with a table with rows that are not in a random order? - Mailing list pgsql-general

From Douglas Alan
Subject What is the right way to deal with a table with rows that are not in a random order?
Date
Msg-id ce6334d00905271653r6e753c2dn5e9f56c735a1eede@mail.gmail.com
Whole thread Raw
Responses Re: What is the right way to deal with a table with rows that are not in a random order?  (Jeff Davis <pgsql@j-davis.com>)
Re: What is the right way to deal with a table with rows that are not in a random order?  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-general
We have a very large table (150 million rows) where the rows are not in a random order.  Some common queries will have millions of results, and this of course is slow.  For an interactive interface to the database, we want to put a limit on all queries so that queries will return quickly even if there are millions of results.

The problem we are seeing at the moment is that the statistics histograms are actually often hurting us.  Postgres sees that for some queries there are so many results that it decides to do a sequential scan, rather than using the index.  Unfortunately, sometimes all of these millions of results are at the end of the table, rather than being randomly distributed in the table, so a sequential scan is the worst possible approach.

To fix this, we could use an "order by" clause to force Postgres to use the index, but this isn't so easy as we are using Postgres through an ORM (i.e, Django in this case), and it's not so easy to try to jury-rig  things this way on a per-query basis.

Alternatively, we could randomize the ordering of the table rows, but that seems like an awfully big hammer, and will also prevent us from ordering the table on a specific index, which is different from the index in question, but might be correlated with it.

Is there a way of telling Postgres not to assume that the table rows are in a random order, so that Postgres won't make the mistake of ever doing a sequence scan on a huge table when there is a small limit?

Thanks!
|>ouglas

pgsql-general by date:

Previous
From: "douglas"
Date:
Subject: vista failed to install postgresql
Next
From: Dave Page
Date:
Subject: Re: vista failed to install postgresql