Thread: What is the right way to deal with a table with rows that are not in a random order?

We have a very large table (150 million rows) where the rows are not in a random order.  Some common queries will have millions of results, and this of course is slow.  For an interactive interface to the database, we want to put a limit on all queries so that queries will return quickly even if there are millions of results.

The problem we are seeing at the moment is that the statistics histograms are actually often hurting us.  Postgres sees that for some queries there are so many results that it decides to do a sequential scan, rather than using the index.  Unfortunately, sometimes all of these millions of results are at the end of the table, rather than being randomly distributed in the table, so a sequential scan is the worst possible approach.

To fix this, we could use an "order by" clause to force Postgres to use the index, but this isn't so easy, as we are using Postgres through an ORM (i.e., Django in this case), and it's hard to jury-rig things this way on a per-query basis.
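For concreteness, the sort of per-query workaround meant here would look something like this at the SQL level (a sketch only, using the table and column names that come up later in the thread):

   -- Ordering on the indexed column nudges the planner toward the index
   -- even when the filter matches a large fraction of the table.
   select * from maindb_astobject
   where survey_id = 2
   order by survey_id
   limit 20;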

Alternatively, we could randomize the ordering of the table rows, but that seems like an awfully big hammer, and will also prevent us from ordering the table on a specific index, which is different from the index in question, but might be correlated with it.

Is there a way of telling Postgres not to assume that the table rows are in a random order, so that Postgres won't make the mistake of ever doing a sequential scan on a huge table when there is a small limit?

Thanks!
|>ouglas
On Wed, 2009-05-27 at 19:53 -0400, Douglas Alan wrote:
> We have a very large table (150 million rows) where the rows are not
> in a random order.  Some common queries will have millions of results,
> and this of course is slow.  For an interactive interface to the
> database, we want to put a limit on all queries so that queries will
> return quickly even if there are millions of results.
>
> The problem we are seeing at the moment is that the statistics
> histograms are actually often hurting us.  Postgres sees that for some
> queries there are so many results that it decides to do a sequential
> scan, rather than using the index.  Unfortunately, sometimes all of
> these millions of results are at the end of the table, rather than
> being randomly distributed in the table, so a sequential scan is the
> worst possible approach.

If you're putting a LIMIT on it, why does it return millions of results?

> To fix this, we could use an "order by" clause to force Postgres to
> use the index, but this isn't so easy as we are using Postgres through
> an ORM (i.e, Django in this case), and it's not so easy to try to
> jury-rig  things this way on a per-query basis.
>

ORDER BY won't always choose an index scan, for instance if the
correlation is low and the expected results are many.

Can you pick out an interesting query and give some specifics, like:
* the query
* the EXPLAIN ANALYZE output (or EXPLAIN without ANALYZE if it takes too
long to even run once)
* EXPLAIN ANALYZE output if you force the index scan
* the statistics for the relevant columns, such as histogram and
correlation

Regards,
    Jeff Davis


On Wed, May 27, 2009 at 8:54 PM, Jeff Davis <pgsql@j-davis.com> wrote:
 
> If you're putting a LIMIT on it, why does it return millions of results?

It doesn't return millions of results with the LIMIT on it.  It just does a sequential scan of the table and doesn't find any results until it gets to the last quarter of the table. Sequentially scanning through 3/4 of the huge table before it gets a single match takes a very long time.

As I said, in my original post, Postgres's approach would be completely reasonable in this case, if the rows that it was looking for were sprinkled randomly throughout the table.  But they're not in this case -- they're all at the end.

> Can you pick out an interesting query and give some specifics, like:
> * the query
> * the EXPLAIN ANALYZE output (or EXPLAIN without ANALYZE if it takes too
> long to even run once)
> * EXPLAIN ANALYZE output if you force the index scan
> * the statistics for the relevant columns, such as histogram and
> correlation

As I mentioned, the situation is very simple, and easy to understand what is going on.  There's absolutely no mystery as to why Postgres is doing what it's doing.  25% of the table matches the query.  Postgres knows this due to the statistics histogram for the column.  Postgres is deciding to do a sequential scan because it knows that 25% of the rows match the query.

Unfortunately, in this case, that's a poor approach.

|>ouglas
On Wed, 2009-05-27 at 19:53 -0400, Douglas Alan wrote:
> We have a very large table (150 million rows) where the rows are not
> in a random order.  Some common queries will have millions of results,
> and this of course is slow.  For an interactive interface to the
> database, we want to put a limit on all queries so that queries will
> return quickly even if there are millions of results.

Partition the table, then scan the correct partition.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Douglas Alan <darkwater42@gmail.com> writes:
> As I said, in my original post, Postgres's approach would be completely
> reasonable in this case, *if* the rows that it was looking for were
> sprinkled randomly throughout the table.  But they're *not* in this case --
> they're all at the end.

There's been some talk of penalizing the seqscan+limit combination
(perhaps by increasing the estimated start cost for the seqscan) if
the WHERE clause involves any variables that have a correlation stat
significantly different from zero.  But nobody's done the legwork
to see if this would really be useful or what an appropriate penalty
curve might be.

            regards, tom lane

On Wed, May 27, 2009 at 8:43 PM, Douglas Alan <darkwater42@gmail.com> wrote:
> On Wed, May 27, 2009 at 8:54 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
>>
>> If you're putting a LIMIT on it, why does it return millions of results?
>
> It doesn't return millions of results with the LIMIT on it.  It just does a
> sequential scan of the table and doesn't find any results until it gets to
> the last quarter of the table. Sequentially scanning through 3/4 of the huge
> table before it gets a single match takes a very long time.
>
> As I said, in my original post, Postgres's approach would be completely
> reasonable in this case, if the rows that it was looking for were sprinkled
> randomly throughout the table.  But they're not in this case -- they're all
> at the end.
>
>> Can you pick out an interesting query and give some specifics, like:
>> * the query
>> * the EXPLAIN ANALYZE output (or EXPLAIN without ANALYZE if it takes too
>> long to even run once)
>> * EXPLAIN ANALYZE output if you force the index scan
>> * the statistics for the relevant columns, such as histogram and
>> correlation
>
> As I mentioned, the situation is very simple, and easy to understand what is
> going on.  There's absolutely no mystery as to why Postgres is doing what
> it's doing.  25% of the table matches the query.  Postgres knows this due to
> the statistics histogram for the column.  Postgres is deciding to do a
> sequential scan because it knows that 25% of the rows match the query.
>
> Unfortunately, in this case, that's a poor approach.

Postgres has a very smart query planner that has a lot of knobs you
can fiddle with to change how and when it switches from one plan to
another.  It's possible that by changing some of those, pgsql will use
a different, more efficient query plan.  Seeing things like explain
analyze output might help someone make a suggestion.  OTOH, if you've
got it all sussed out, then ignore the request for more information.

Note that in the OP's case I'd probably try testing things like turning
off seqscan, or lowering random_page_cost.  I'd also look at
clustering on the index for the field you're selecting on.
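For example, the experiments meant here would be something like the following (a sketch only; these are session-level settings, and the CLUSTER syntax is the 8.3+ form, using the index name that shows up later in the thread):

   set enable_seqscan = off;      -- see whether the index plan wins
   set random_page_cost = 2;      -- lowered from the default of 4
   -- Physically reorder the table on the index in question:
   cluster maindb_astobject using maindb_astobject_survey_id;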

On Thu, May 28, 2009 at 4:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> Partition the table, then scan the correct partition.

If I do that, will Postgres figure out the "right thing" to do if the
parent table is queried instead?  Also, what are the performance
implications then for doing queries that span all the partitions,
which will be the norm for our application?

The application in question is a kind of data warehousing thing (of
astronomical stars), and there's an ORM in the middle, so it's not
easy for us to hand-tune how individual queries are specified.
Instead, we have to structure the database and the indexes so that
things generally perform well, without having to tweak specific
queries.

Users can specify fairly arbitrary search criteria.  All of the
queries should perform well.  By "well", I mean within 10 seconds or
so.  Scanning all of the 150 million rows takes much longer than 10
seconds, unfortunately.

Any one of these "solutions" will cause Postgres to do an index scan
in the problematic case where Postgres is deciding to do a sequential
scan.  The index scan performs snappily enough:

   - Using "order by" on the query.

   - Changing the search value for the column to a value that occurs
less frequently.

   - Fetching the value to search for via a sub-query so that Postgres
can't determine a priori that the value being searched for occurs so
commonly.

Unfortunately, as I mentioned, due to the ORM, none of these solutions
really work for us in practice, as opposed to at a psql prompt.

|>ouglas

On Thu, May 28, 2009 at 10:24 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

>  OTOH, if you've got it all sussed out, then ignore the request for more information.

I don't *know* if I have it "all sussed out", but I *do* know why
Postgres is doing what it is doing in this particular case.  It's
assuming that the value in question is evenly distributed throughout
the table, when in actuality, the value in question is clustered at
the very end of the table.

What I don't have sussed out is how best to address this issue.

The most obvious solution would be an option to tell Postgres not to
assume that the value is evenly distributed throughout the table, and
to take account of the fact that the data in question might very well
be clustered at the very end of the table.

|>ouglas

On Thu, May 28, 2009 at 10:41 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

> Note that in the OPs case I'd probably try testing things like turning
> off seqscan, or lowering random_page_cost.  I'd also look at
> clustering on the index for the field you're selecting on.

I'm worried that turning off seqscan would distort other queries.  (Remember, I don't have control of the individual queries.  The user of the application can specify all sorts of queries, and there's an ORM in the middle.)

In order to force Postgres to do an index scan for this query, I had to set random_page_cost to 0.  Even 0.1 was not small enough.  Alternatively, I could set seq_page_cost to 39.  (38 was not big enough.)  Again, I'm worried that by using such a big hammer, I would distort Postgres's query planning for other queries.
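(For reference, those experiments amount to session-level settings along these lines; a sketch only:)

   set random_page_cost = 0;    -- 0.1 was still not small enough
   -- or, alternatively:
   set seq_page_cost = 39;      -- 38 was not big enough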

|>ouglas

P.S. Here is the actual data that some people have been clamoring for:

Slow query:

psql> select * from maindb_astobject
     where survey_id = 2
     limit 20 offset 1000;

Time: 18073.691 ms


Here's the explanation:

                                    QUERY PLAN                                      
-------------------------------------------------------------------------------------
Limit  (cost=47.99..48.95 rows=20 width=153)
  ->  Seq Scan on maindb_astobject  (cost=0.00..3538556.10 rows=73736478 width=153)
        Filter: (survey_id = 2)

The same query made fast by removing Postgres's ability to know a priori what particular value is being searched for:

psql> select * from maindb_astobject join maindb_enumentity
     on maindb_astobject.survey_id = maindb_enumentity.id
     where entityname = 'MACHO'
     limit 20 offset 1000;

Time: 1.638 ms

Here's the explanation for the above query showing that it is using the index:

                                 QUERY PLAN
  --------------------------------------------------------------------------
   Limit  (cost=164.97..168.27 rows=20 width=215)
     ->  Nested Loop  (cost=0.00..1233523.72 rows=7477081 width=215)
           ->  Seq Scan on maindb_enumentity  (cost=0.00..1.12 rows=1 width=62)
                 Filter: ((entityname)::text = 'MACHO'::text)
           ->  Index Scan using maindb_astobject_survey_id on maindb_astobject  (cost=0.00..1046595.57 rows=14954162 width=153)
                 Index Cond: (maindb_astobject.survey_id = maindb_enumentity.id)


P.P.S. Many other queries were very slow due to Postgres wanting to use hash joins and merge joins rather than nested index scans.  The nested index scans were, in fact, much faster.  I just turned off hash joins and merge joins:  problem solved.  It might be nice at some point to figure out what is going on with Postgres trying to use these fancier joins that turn out to be much slower for us, but that's a worry for another day, since I have a perfectly good work-around at the moment.
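(The work-around referred to is just the planner toggles, roughly like this; a sketch only, shown as session-level settings though it could equally go in postgresql.conf:)

   set enable_hashjoin = off;
   set enable_mergejoin = off;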

On Thu, May 28, 2009 at 1:12 PM, Douglas Alan <darkwater42@gmail.com> wrote:
> On Thu, May 28, 2009 at 10:24 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
>>  OTOH, if you've got it all sussed out, then ignore the request for more information.
>
> I don't *know* if I have it "all sussed out", but I *do* know why
> Postgres is doing what it is doing in this particular case.  It's
> assuming that the value in question is evenly distributed throughout
> the table, when in actuality, the value in question is clustered at
> the very end of the table.

It's doing way more than that.  My point above was that the query
planner is not JUST assuming the values are well ordered.  It's
assuming random_page_cost is x times more than sequential page cost,
it's assuming the table doesn't fit in effective cache, or shared
buffers, it's assuming lots of things based on how you've tuned (or
not) your database.

I'll finish in reply to your other post.

On Thu, May 28, 2009 at 1:45 PM, Douglas Alan <darkwater42@gmail.com> wrote:
> On Thu, May 28, 2009 at 10:41 AM, Scott Marlowe <scott.marlowe@gmail.com>
> wrote:
>
>> Note that in the OPs case I'd probably try testing things like turning
>> off seqscan, or lowering random_page_cost.  I'd also look at
>> clustering on the index for the field you're selecting on.
>
> I'm worried that turning off seqscan would distort other queries.
>  (Remember, I don't have control of the individual queries.  The user of the
> application can specify all sorts of queries, and there's an ORM in the
> middle.)

You are aware you can turn off seq scans for just the current connection, right?

set enable_seqscan=off;
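If even that feels too broad, it can be scoped to a single transaction with SET LOCAL; a sketch, using the query from this thread:

   begin;
   set local enable_seqscan = off;
   select * from maindb_astobject where survey_id = 2 limit 20 offset 1000;
   commit;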

> In order to force Postgres to do an index scan for this query, I had to set
> random_page_cost to 0.  Even 0.1 was not small enough.  Alternatively, I
> could set seq_page_cost to 39.  (38 was not big enough.)  Again, I'm worried
> that by using such a big hammer, I would distort Postgres's query planning
> for other queries.

No one's saying to do it all the time.  They're saying to do it and
then run explain analyze on your query, then post the results of both
and let us have a look.

> P.S. Here is the actual data that some people have been clamoring for:

No, it's not.  It's just explain output.  Unless the actual query
takes hours to run, it's much more useful to have explain analyze
output.  Without the analyze part, it's only showing what the planner
expects, not what actually happens.

> Slow query:
>
> psql> select * from maindb_astobject
>      where survey_id = 2
>      limit 20 offset 1000;

You'll notice that in your previous posts you never mentioned offset,
which greatly affects the plan chosen.

Is survey_id indexed?

> Time: 18073.691 ms
>
> Here's the explanation:
>
>                                     QUERY PLAN
>
> -------------------------------------------------------------------------------------
> Limit  (cost=47.99..48.95 rows=20 width=153)
>   ->  Seq Scan on maindb_astobject  (cost=0.00..3538556.10 rows=73736478
> width=153)
>         Filter: (survey_id = 2)
>
> The same query made fast by removing Postgres's ability to know a priori
> what particular value is being searched for:
>
> psql> select * from maindb_astobject join maindb_enumentity
>      on maindb_astobject.survey_id = maindb_enumentity.id
>      where entityname = 'MACHO'
>      limit 20 offset 1000;

That's not the same query.

> Time: 1.638 ms
>
> Here's the explanation for the above query showing that it is using the
> index:

So, can we see explain analyze?

> QUERY PLAN
>   --------------------------------------------------------------------------
>    Limit  (cost=164.97..168.27 rows=20 width=215)
>      ->  Nested Loop  (cost=0.00..1233523.72 rows=7477081 width=215)
>            ->  Seq Scan on maindb_enumentity  (cost=0.00..1.12 rows=1 width=62)
>                  Filter: ((entityname)::text = 'MACHO'::text)
>            ->  Index Scan using maindb_astobject_survey_id on maindb_astobject  (cost=0.00..1046595.57 rows=14954162 width=153)
>                  Index Cond: (maindb_astobject.survey_id = maindb_enumentity.id)
>
>
> P.P.S. Many other queries were very slow due to Postgres wanting to use hash
> joins and merge joins rather than nested index scans.

Then it's quite possible you have a problem with misestimation of
values in your db.  Since we have no explain ANALYZE output, we cannot
judge if this is the case.


> The nested index
> scans were, in fact, much faster.  I just turned off hash joins and merge
> joins:  problem solved.

One problem solved, another one created is more likely.

> It might be nice at some point to figure out what
> is going on with Postgres trying to use these fancier joins that turn out to be
> much slower for us, but that's a worry for another day, since I have a
> perfectly good work-around at the moment.

Yes, let us see explain analyze output and maybe we can help.  You
know, the thing that was asked for at the beginning.

ORM shouldn't be a factor, as you have many configurable properties available in hibernate.cfg.xml or hibernate.properties:
hibernate.connection.driver_class    jdbc driver class
hibernate.connection.url             jdbc URL
hibernate.connection.username        database user
hibernate.connection.password        database user password
hibernate.connection.pool_size       maximum number of pooled connections
Hibernate's own connection pooling algorithm is however quite rudimentary. It is intended to help you get started and is not intended for use in a production system or even for performance testing. You should use a third party pool for best performance and stability. Just replace the hibernate.connection.pool_size property with connection pool specific settings. This will turn off Hibernate's internal pool. For example, you might like to use C3P0.
C3P0 is an open source JDBC connection pool distributed along with Hibernate in the lib directory. Hibernate will use its C3P0ConnectionProvider for connection pooling if you set hibernate.c3p0.* properties. If you'd like to use Proxool refer to the packaged hibernate.properties and the Hibernate web site for more information.
Here is an example hibernate.properties file for C3P0:
hibernate.connection.driver_class = org.postgresql.Driver
hibernate.connection.url = jdbc:postgresql://localhost/mydatabase
hibernate.connection.username = myuser
hibernate.connection.password = secret
hibernate.c3p0.min_size=5
hibernate.c3p0.max_size=20
hibernate.c3p0.timeout=1800
hibernate.c3p0.max_statements=50
hibernate.dialect = org.hibernate.dialect.PostgreSQLDialect
https://www.hibernate.org/214.html

Martin Gainty





> From: darkwater42@gmail.com
> Date: Thu, 28 May 2009 15:03:32 -0400
> Subject: Re: [GENERAL] What is the right way to deal with a table with rows that are not in a random order?
> To: pgsql-general@postgresql.org
> CC: simon@2ndquadrant.com
>
> On Thu, May 28, 2009 at 4:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > Partition the table, then scan the correct partition.
>
> If I do that, will Postgres figure out the "right thing" to do if the
> parent table is queried instead?  Also, what are the performance
> implications then for doing queries that span all the partitions,
> which will be the norm for our application?
>
> The application in question is a kind of data warehousing thing (of
> astronomical stars), and there's an ORM in the middle, so it's not
> easy for us to hand-tune how individual queries are specified.
> Instead, we have to structure the database and the indexes so that
> things generally perform well, without having to tweak specific
> queries.
>
> Users can specify fairly arbitrary search criteria.  All of the
> queries should perform well.  By "well", I mean within 10 seconds or
> so.  Scanning all of the 150 million rows takes much longer than 10
> seconds, unfortunately.
>
> Any one of these "solutions" will cause Postgres to do an index scan
> in the problematic case where Postgres is deciding to do a sequential
> scan.  The index scan performs snappily enough:
>
>    - Using "order by" on the query.
>
>    - Changing the search value for the column to a value that occurs
> less frequently.
>
>    - Fetching the value to search for via a sub-query so that Postgres
> can't determine a priori that the value being searched for occurs so
> commonly.
>
> Unfortunately, as I mentioned, due to the ORM, none of these solutions
> really work for us in practice, as opposed to at a psql prompt.
>
> |>ouglas
>


On Thu, 2009-05-28 at 15:03 -0400, Douglas Alan wrote:

> The application in question is a kind of data warehousing thing (of
> astronomical stars), and there's an ORM in the middle, so it's not
> easy for us to hand-tune how individual queries are specified.
> Instead, we have to structure the database and the indexes so that
> things generally perform well, without having to tweak specific
> queries.

You should look at indexing solutions, or take professional advice on
that.

> Users can specify fairly arbitrary search criteria.  All of the
> queries should perform well.  By "well", I mean within 10 seconds or
> so.

That's an unrealistic expectation, unfortunately. Fast search requires
some form of preparatory action and without any clue as to what that
should be you cannot guarantee response times.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


On Thu, May 28, 2009 at 5:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

> On Thu, 2009-05-28 at 15:03 -0400, Douglas Alan wrote:
>
> > The application in question is a kind of data warehousing thing (of
> > astronomical stars), and there's an ORM in the middle, so it's not
> > easy for us to hand-tune how individual queries are specified.
> > Instead, we have to structure the database and the indexes so that
> > things generally perform well, without having to tweak specific
> > queries.
>
> You should look at indexing solutions, or take professional advice on
> that.

I'm not sure that I know what an "indexing solution" is.
 

> > Users can specify fairly arbitrary search criteria.  All of the
> > queries should perform well.  By "well", I mean within 10 seconds or
> > so.
>
> That's an unrealistic expectation, unfortunately. Fast search requires
> some form of preparatory action and without any clue as to what that
> should be you cannot guarantee response times.

Unrealistic or not, it mostly works currently, modulo Postgres sometimes deciding to do a slow sequential scan when there is a perfectly good index.

For instance, I just tried a query that returns 137,042 results (out of 150 million rows), is constrained in several different ways, and involves a join with another large table, and it took nine seconds.

|>ouglas

On Thu, 2009-05-28 at 15:12 -0400, Douglas Alan wrote:
> The most obvious solution would be an option to tell Postgres not to
> assume that the value is evenly distributed throughout the table, and
> to take account of the fact that the data in question might very well
> be clustered at the very end of the table.

There's no use adding a new statistic (user supplied or collected) to
PostgreSQL until we know that correlation isn't useful for that purpose.
Can you tell us the correlation that PostgreSQL has already collected (I
apologize if I missed this elsewhere in the thread)?
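(Something along these lines would show it; a sketch, assuming the table and column names from earlier in the thread:)

   select tablename, attname, n_distinct, correlation
   from pg_stats
   where tablename = 'maindb_astobject' and attname = 'survey_id';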

Hopefully, correlation is useful enough. With some planner tweaks
similar to the ones Tom mentioned, and a few more data points, maybe
we'll have a real solution.

Regards,
    Jeff Davis


Scott Marlowe <scott.marlowe@gmail.com> wrote:

> Douglas Alan wrote:

>> I'm worried that turning off seqscan would distort other queries.
>> (Remember, I don't have control of the individual queries.  The
>> user of the application can specify all sorts of queries, and
>> there's an ORM in the middle.)

> You are aware you can turn off seq scans for just the current
> connection, right?

Yes, of course.  I thought that the suggestion was to change this
setting in production, not just for diagnostic purposes.  As I
previously reported, I've already changed some settings for diagnostic
purposes and also reported what values I had to set them to to force
an index scan for the query in question.

> No one's saying to do it all the time.  They're saying to do it and
> then run explain analyze on your query, then post the results of
>> both and let us have a look.

Okay -- no problem:

   set enable_seqscan = on;
   explain analyze select * from maindb_astobject
   where survey_id = 2
   limit 1000;

   "Limit  (cost=0.00..48.03 rows=1000 width=78) (actual
time=84837.835..265938.258 rows=1000 loops=1)"
   "  ->  Seq Scan on maindb_astobject  (cost=0.00..3538556.10
rows=73675167 width=78) (actual time=84837.825..265932.121 rows=1000
loops=1)"
   "        Filter: (survey_id = 2)"
   "Total runtime: 265942.416 ms"


   set enable_seqscan = off;
   explain analyze select * from maindb_astobject
   where survey_id = 2
   limit 1000;

   "Limit  (cost=0.00..67.37 rows=1000 width=78) (actual
time=172.248..225.219 rows=1000 loops=1)"
   "  ->  Index Scan using maindb_astobject_survey_id on
maindb_astobject  (cost=0.00..4963500.87 rows=73675167 width=78)
(actual time=172.240..221.078 rows=1000 loops=1)"
   "        Index Cond: (survey_id = 2)"
   "Total runtime: 227.412 ms"

>> Slow query:
>>
>> psql> select * from maindb_astobject
>>      where survey_id = 2
>>      limit 20 offset 1000;

> You'll notice that in your previous posts you never mentioned offset,
> which greatly affects the plan chosen.

It makes little difference for this query.  The plan is the same with
or without the offset.  The purpose of the offset was just to more or
less simulate a "limit 1020" without getting deluged by 1,020 results.

> Is survey_id indexed?

Yes, of course.

>> P.P.S. Many other queries were very slow due to Postgres wanting to use hash
>> joins and merge joins rather than nested index scans.

> Then it's quite possible you have a problem with misestimation of
> values in your db.  Since we have no explain ANALYZE output, we cannot
> judge if this is the case.

Okay, well if you can also tell me how to fix the hash join / merge
join problem we are seeing without resorting to the current hack I am
using to fix it, I will be very thankful!

>> The nested index
>> scans were, in fact, much faster.  I just turned off hash joins and merge
>> joins:  problem solved.

>One problem solved, another one created is more likely.

Such as?  I haven't seen a nested index scan perform poorly yet.

Though if there is some database parameter that is set incorrectly,
and this is causing the hash join / merge join problem, then I can see
that having that parameter not be set correctly could be a source of
many future woes, so I certainly would appreciate any insight into
that.

|>ouglas


P.S. Here are the stats on the column. It appears that my recollection
of 25% of the table matching was a bit off. It's actually 98.5%!  That
might explain more why Postgres wants to do a sequential scan. The
problem is that still means that it has to scan a million rows
sequentially before it finds a single matching row, as the matching
rows are at the end of the database:

select * from pg_stats
where tablename='maindb_astobject' and attname = 'survey_id';

 schemaname        | public
 tablename         | maindb_astobject
 attname           | survey_id
 null_frac         | 0
 avg_width         | 4
 n_distinct        | 5
 most_common_vals  | {2,4,10,3,5}
 most_common_freqs | {0.985347,0.00966,0.00286667,0.0019,0.000226667}
 histogram_bounds  |
 correlation       | 0.998872

On Thu, May 28, 2009 at 6:45 PM, Douglas Alan <darkwater42@gmail.com> wrote:
> Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
>> Douglas Alan wrote:
>
>>> I'm worried that turning off seqscan would distort other queries.
>>> (Remember, I don't have control of the individual queries.  The
>>> user of the application can specify all sorts of queries, and
>>> there's an ORM in the middle.)
>
>> You are aware you can turn off seq scans for just the current
>> connection, right?
>
> Yes, of course.  I thought that the suggestion was to change this
> setting in production, not just for diagnostic purposes.  As I
> previously reported, I've already changed some settings for diagnostic
> purposes and also reported what values I had to set them to to force
> an index scan for the query in question.

Well there are varying degrees of doing this in production, if needed.
 You can use a separate account for these queries that has different
settings so the query planner makes the right decision etc.
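For example, per-role overrides are picked up automatically whenever that role logs in; a sketch only, with a made-up role name:

   alter role reporting_user set enable_seqscan = off;
   alter role reporting_user set random_page_cost = 1.0;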

>> No one's saying to do it all the time.  They're saying to do it and
>> then run explain analyze on your query, then post the results of
>> both let us have a look.
>
> Okay -- no problem:
>
>    set enable_seqscan = on;
>    explain analyze select * from maindb_astobject
>    where survey_id = 2
>    limit 1000;
>
>    "Limit  (cost=0.00..48.03 rows=1000 width=78) (actual
> time=84837.835..265938.258 rows=1000 loops=1)"
>    "  ->  Seq Scan on maindb_astobject  (cost=0.00..3538556.10
> rows=73675167 width=78) (actual time=84837.825..265932.121 rows=1000
> loops=1)"
>    "        Filter: (survey_id = 2)"
>    "Total runtime: 265942.416 ms"
>
>    set enable_seqscan = off;
>    explain analyze select * from maindb_astobject
>    where survey_id = 2
>    limit 1000;
>
>    "Limit  (cost=0.00..67.37 rows=1000 width=78) (actual
> time=172.248..225.219 rows=1000 loops=1)"
>    "  ->  Index Scan using maindb_astobject_survey_id on
> maindb_astobject  (cost=0.00..4963500.87 rows=73675167 width=78)
> (actual time=172.240..221.078 rows=1000 loops=1)"
>    "        Index Cond: (survey_id = 2)"
>    "Total runtime: 227.412 ms"

What was the random_page_cost during these two queries?  Assuming
seq_page_cost is 1, and random_page_cost was 4 or something, lowering
it should force the move to an index scan.  If you were already at 1.0
or so, then yeah, the cost estimation is off.  Since index scans cost
CPU (at least I think they do), you might try lowering your cpu_*
costs to see if that helps.


>
>>> Slow query:
>>>
>>> psql> select * from maindb_astobject
>>>      where survey_id = 2
>>>      limit 20 offset 1000;
>
>> You'll notice that in your previous posts you never mentioned offset,
>> which greatly affects the plan chosen.
>
> It makes little difference for this query.  The plan is the same with
> or without the offset.  The purpose of the offset was just to more or
> less simulate a "limit 1020" without getting deluged by 1,020 results.

Ah yes, I can see that now.

>>> P.P.S. Many other queries were very slow due to Postgres wanting to use hash
>>> joins and merge joins rather than nested index scans.
>
>> Then it's quite possible you have a problem with misestimation of
>> values in your db.  Since we have no explain ANALYZE output, we cannot
>> judge if this is the case.
>
> Okay, well if you can also tell me how to fix the hash join / merge
> join problem we are seeing without resorting to the current hack I am
> using to fix it, I will be very thankful!

Most of the time that happens, it's a question of estimates being way
off, OR various cost values.  Explain analyze with hash joins enabled
and disabled can give you an idea where the planner might be making a
mistake.  With large tables it's a good idea to crank up your default
stats target and reanalyze anyway, so that's worth a first try.
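Roughly like this, per column (a sketch; 100 is just an example target, and default_statistics_target in postgresql.conf raises it globally instead):

   alter table maindb_astobject alter column survey_id set statistics 100;
   analyze maindb_astobject (survey_id);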

>>> The nested index
>>> scans were, in fact, much faster.  I just turned off hash joins and merge
>>> joins:  problem solved.
>
>>One problem solved, another one created is more likely.
>
> Such as?  I haven't seen a nested index scan perform poorly yet.

Who knows?  Pgsql has enough other join methods that it's not likely to be
a huge issue.  But if your type of query changes a lot, the nested
loops might get really slow really fast.

> Though if there is some database parameter that is set incorrectly,
> and this is causing the hash join / merge join problem, then I can see
> that having that parameter not be set correctly could be a source of
> many future woes, so I certainly would appreciate any insight into
> that.

The most common cause of these being the wrong choice is a default stats
target that's too low for your data.  But you've only got like 4 values in
this one field, so I doubt that has a big effect.

>
> |>ouglas
>
>
> P.S. Here are the stats on the column. It appears that my recollection
> of 25% of the table matching was a bit off. It's actually 98.5%!  That
> might explain more why Postgres wants to do a sequential scan. The
> problem is that still means that it has to scan a million rows
> sequentially before it finds a single matching row, as the matching
> rows are at the end of the database:

Yeah, that's a really skewed distribution.  Partitioning may work out,
especially if you often select on that one field.
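A minimal sketch of what the inheritance-based partitioning available in 8.x might look like here (the child table name is made up; routing inserts with a trigger or rule and migrating the existing rows would also be needed):

   create table maindb_astobject_survey2 (
       check (survey_id = 2)
   ) inherits (maindb_astobject);

   -- With constraint exclusion on, a query against the parent that
   -- filters on survey_id = 2 only has to visit the matching child:
   set constraint_exclusion = on;
   select * from maindb_astobject where survey_id = 2 limit 20;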

Scott Marlowe wrote:

> Douglas Alan wrote:

>> Okay -- no problem:
>>
>>    set enable_seqscan = on;
>>    explain analyze select * from maindb_astobject
>>    where survey_id = 2
>>    limit 1000;
>>
>>    "Limit  (cost=0.00..48.03 rows=1000 width=78) (actual
>> time=84837.835..265938.258 rows=1000 loops=1)"
>>    "  ->  Seq Scan on maindb_astobject  (cost=0.00..3538556.10
>> rows=73675167 width=78) (actual time=84837.825..265932.121 rows=1000
>> loops=1)"
>>    "        Filter: (survey_id = 2)"
>>    "Total runtime: 265942.416 ms"
>>
>>    set enable_seqscan = off;
>>    explain analyze select * from maindb_astobject
>>    where survey_id = 2
>>    limit 1000;
>>
>>    "Limit  (cost=0.00..67.37 rows=1000 width=78) (actual
>> time=172.248..225.219 rows=1000 loops=1)"
>>    "  ->  Index Scan using maindb_astobject_survey_id on
>> maindb_astobject  (cost=0.00..4963500.87 rows=73675167 width=78)
>> (actual time=172.240..221.078 rows=1000 loops=1)"
>>    "        Index Cond: (survey_id = 2)"
>>    "Total runtime: 227.412 ms"

> What was the random_page_cost during these two queries?

4

> Assuming seq_page_cost is 1,

Yes, it is.

> and random_page_cost was 4 or something, lowering it should
> force the move to an index scan.

I just tried changing random_page_cost to 1, but the query still
does a seq scan.

> If you were already at 1.0 or so, then yeah, the cost
> estimation is off.  Since index scans cost CPU (at least I
> think they do), you might try lowering your cpu_* costs to see
> if that helps

How would lowering random_page_cost and all the cpu costs differ
from just increasing seq_page_cost?

I have to raise seq_page_cost from 1 to 34 to force an index
scan.  I can't imagine that changing this value so radically would
be a good idea.

Alternatively, if I set random_page_cost to 1, and
cpu_tuple_cost, cpu_index_tuple_cost, and cpu_operator_cost all
to 0, this still doesn't cause Postgres to do an index scan for
this query.

>> P.S. Here are the stats on the column. It appears that my recollection
>> of 25% of the table matching was a bit off. It's actually 98.5%!  That
>> might explain more why Postgres wants to do a sequential scan. The
>> problem is that still means that it has to scan a million rows
>> sequentially before it finds a single matching row, as the matching
>> rows are at the end of the database:

> Yeah, that's a really skewed distribution.  Partitioning may work out,
> especially if you often select on that one field.

Is there a way for me to alter the statistics table?  I tried
changing the values in pg_stats, but that table is just a view,
so Postgres won't let me do it.  pg_statistic, on the other hand,
is rather opaque.

Alternatively, can I delete the statistics for the column?  It's
the statistics that are hurting me here.

To delete the statistics, I tried setting statistics for the
column to 0 and analyzing the column, but that just left the
current statistics in place.  So I tried setting statistics to 1,
but that's one value too many to eliminate this problem!
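(Concretely, the statistics-target experiment described above was along these lines; a sketch only:)

   alter table maindb_astobject alter column survey_id set statistics 0;
   analyze maindb_astobject (survey_id);   -- the old stats stay in place

   alter table maindb_astobject alter column survey_id set statistics 1;
   analyze maindb_astobject (survey_id);   -- still records the one dominant value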

|>ouglas