Re: Why are selects so slow on large tables, even - Mailing list pgsql-general

From Jason Earl
Subject Re: Why are selects so slow on large tables, even
Date
Msg-id 87hen02ue3.fsf@npa01zz001.simplot.com
Whole thread Raw
In response to Why are selects so slow on large tables, even when indexed?  ("Robert Wille" <rwille@iarchives.com>)
List pgsql-general
"Robert Wille" <rwille@iarchives.com> writes:

> The suggested fixes have helped a lot, but it is still rather
> slow. The time varies and can be upwards of 10 to 20 seconds on a
> ~47M row table. Is this normal? Similar queries on an indexed
> varchar column in Oracle with about 1/2 as many rows execute at
> least a hundred times faster.

I realize that it has taken me quite a while to get back to you, on
this particular case but I have been running some tests on your data
(or 28 million lines of it anyway) and I have some stuff to share.

First of all, your sample data set has the *opposite* problem of most
queries that PostgreSQL users complain about.  Most people complain
about queries that do sequential scans when PostgreSQL should be doing
an index scan.  Your data, on the other hand, caused PostgreSQL to do
an indexscan when it probably should have been doing a sequential
scan.  After all, there are only 1000 (or so) different unique values
of "id" and the instances of each value are spread evenly throughout
the table.  Since you are going to touch most pages anyhow consulting
the index is just an extra step.  On my limited test machine I
actually saw modest gains on queries like:

SELECT count(*) FROM a WHERE id = 89::bigint;

However, if you cluster the index on id then similar queries started
to return *immediately*.  Don't forget to vacuum after clustering.

I hope this was helpful,  I sured learned a lot.

Jason

pgsql-general by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Performance Tuning Document?
Next
From: Matthew Kirkwood
Date:
Subject: Re: Performance Tuning Document?