Re: Small Queries Really Fast, Large Queries Really Slow... - Mailing list pgsql-performance

From tv@fuzzy.cz
Subject Re: Small Queries Really Fast, Large Queries Really Slow...
Date
Msg-id 3376.85.160.2.47.1277378404.squirrel@sq.gransy.com
Whole thread Raw
In response to Small Queries Really Fast, Large Queries Really Slow...  (Tom Wilcox <hungrytom@gmail.com>)
List pgsql-performance
> Any suggestions on what I can do to speed things up? I presume if I turn
> off
> Sequential Scan then it might default to Index Scan.. Is there anything
> else?
>
> Cheers,
> Tom

Well, I doubt turning off the sequential scan will improve the performance
in this case - actually the first case (running 400 sec) uses an index
scan, while the 'fast' one uses sequential scan.

Actually I'd try exactly the oposite - disabling the index scan, i.e.
forcing it to use sequential scan in the first case. You're selecting
about 4% of the rows, but we don't know how 'spread' are those rows
through the table. It might happen PostgreSQL actually has to read all the
blocks of the table.

This might be improved by clustering - create and index on the
'match_data_id' colunm and then run

CLUSTER match_data_id_idx ON match_data;

This will sort the table accoring to match_data_id column, which should
improve the performance. But it won't last forever - it degrades through
time, so you'll have to perform clustering once a while (and it locks the
table, so be careful).

How large is the table anyway? How many rows / pages are there? Try
something like this

SELECT reltuples, relpages FROM pg_class WHERE relname = 'match_data';

Multiply the blocks by 8k and you'll get the occupied space. How much is
it? How much memory (shared_buffers) is there?

You could try partitioning accoring to the match_data_id column, but there
are various disadvantages related to foreign keys etc. and it's often a
major change in the application, so I'd consider other solutions first.

BTW I have no experience with running PostgreSQL inside a Virtual Box VM,
so it might be another source of problems. I do remember we had some
serious problems with I/O (network and disks) when running vmware, but it
was a long time ago and now it works fine. But maybe this the root cause?
Can you run dstat / vmstat / iostat or something like that in the host OS
to see which of the resources is causing problems (if any)?

Tomas


pgsql-performance by date:

Previous
From: Tom Wilcox
Date:
Subject: Small Queries Really Fast, Large Queries Really Slow...
Next
From: Robert Haas
Date:
Subject: Re: PostgreSQL as a local in-memory cache