Thread: Slow select performance despite seemingly reasonable query plan

Slow select performance despite seemingly reasonable query plan

From
David Brain
Date:
Hi,

Some context, we have a _lot_ of data, > 1TB, mostly in 1 'table' -
the 'datatable' in the example below although in order to improve
performance this table is partitioned (by date range) into a number of
partition tables.  Each partition contains up to 20GB of data (tens of
millons of rows), with an additional ~3GB of indexes, all this is
served off a fairly high performance server (8 core 32Gb, with FC
attached SAN storage).  PostgreSQL version is 8.3.5 (running on 64bit
RHEL 5.2)

This has been working reasonably well, however in the last few days
I've been seeing extremely slow performance on what are essentially
fairly simple 'index hitting' selects on this data.  From the host
side I see that the postgres query process is mostly in IO wait,
however there is very little data actually being transferred (maybe
2-4 MB/s) - when a different query (say a select count(*) form
datatable) will yield a sustained 150+ MB/s.  There have been no
configuration changes during this time, although of course the
database has grown as data is added on a daily basis.

I'm not sure of the best way to diagnose this issue - the possible
causes  I can think of are:

1. Problem with random versus sequential reads on storage system.
2. 'Something' with PostgreSQL itself.
3. Problem with the host environment - one suspicion I have here is
that we are >90% full on the storage drives (ext3), I'm not sure if
that is impacting performance.

Any thoughts as to how to procede from here would be very welcome.

Here is an example query plan - looks reasonable to me, seems is
making use of the indexes and the constraint exclusion on the
partition tables:

Nested Loop Left Join  (cost=0.00..6462463.96 rows=1894 width=110)
   ->  Append  (cost=0.00..6453365.66 rows=1894 width=118)
         ->  Seq Scan on datatable sum  (cost=0.00..10.75 rows=1 width=118)
               Filter: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text) AND (NOT CASE WHEN (NOT obsolete)
THEN false ELSE CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN
(obsoletereasonid = 1) THEN true WHEN (obsoletereasonid = 2) THEN true
WHEN (cdrdatasourceid = 1) THEN false ELSE true END END))
         ->  Index Scan using
datatable_20090328_customeriddatapointdate_idx on datatable_20090328
sum  (cost=0.00..542433.51 rows=180 width=49)
               Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
               Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
         ->  Index Scan using
datatable_20090404_customeriddatapointdate_idx on datatable_20090404
sum  (cost=0.00..1322098.74 rows=405 width=48)
               Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
               Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
         ->  Index Scan using
datatable_20090411_customeriddatapointdate_idx on datatable_20090411
sum  (cost=0.00..1612744.29 rows=450 width=48)
               Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
               Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
         ->  Index Scan using
datatable_20090418_customeriddatapointdate_idx on datatable_20090418
sum  (cost=0.00..1641913.58 rows=469 width=49)
               Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
               Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
         ->  Index Scan using
datatable_20090425_customeriddatapointdate_idx on datatable_20090425
sum  (cost=0.00..1334164.80 rows=389 width=49)
               Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
               Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
   ->  Index Scan using pk_cdrextension on cdrextension ext
(cost=0.00..4.77 rows=1 width=8)
         Index Cond: (sum.id = ext.datatableid)


Thanks,

David.

Re: Slow select performance despite seemingly reasonable query plan

From
Scott Mead
Date:
On Thu, May 7, 2009 at 10:14 AM, David Brain <dbrain@bandwidth.com> wrote:
Hi,

Some context, we have a _lot_ of data, > 1TB, mostly in 1 'table' -
the 'datatable' in the example below although in order to improve
performance this table is partitioned (by date range) into a number of
partition tables.  Each partition contains up to 20GB of data (tens of
millons of rows), with an additional ~3GB of indexes, all this is
served off a fairly high performance server (8 core 32Gb, with FC
attached SAN storage).  PostgreSQL version is 8.3.5 (running on 64bit
RHEL 5.2)

This has been working reasonably well, however in the last few days
I've been seeing extremely slow performance on what are essentially
fairly simple 'index hitting' selects on this data.  

   Have you re-indexed any of your partitioned tables?  If you're index is fragmented, you'll be incurring extra I/O's per index access.  Take a look at the pgstattuple contrib for some functions to determine index fragmentation.  You can also take a look at the pg_stat_all_indexes tables.  If your number of tup's fetched is 100 x more than your idx_scans, you *may* consider reindexing.

--Scott

Re: Slow select performance despite seemingly reasonable query plan

From
David Brain
Date:
Hi,

Interesting, for one index on one partition:

idx_scan: 329
idx_tup_fetch: 8905730

So maybe a reindex would help?

David.

On Thu, May 7, 2009 at 10:26 AM, Scott Mead
<scott.lists@enterprisedb.com> wrote:
> On Thu, May 7, 2009 at 10:14 AM, David Brain <dbrain@bandwidth.com> wrote:
>>
>> Hi,
>>
>> Some context, we have a _lot_ of data, > 1TB, mostly in 1 'table' -
>> the 'datatable' in the example below although in order to improve
>> performance this table is partitioned (by date range) into a number of
>> partition tables.  Each partition contains up to 20GB of data (tens of
>> millons of rows), with an additional ~3GB of indexes, all this is
>> served off a fairly high performance server (8 core 32Gb, with FC
>> attached SAN storage).  PostgreSQL version is 8.3.5 (running on 64bit
>> RHEL 5.2)
>>
>> This has been working reasonably well, however in the last few days
>> I've been seeing extremely slow performance on what are essentially
>> fairly simple 'index hitting' selects on this data.
>
>    Have you re-indexed any of your partitioned tables?  If you're index is
> fragmented, you'll be incurring extra I/O's per index access.  Take a look
> at the pgstattuple contrib for some functions to determine index
> fragmentation.  You can also take a look at the pg_stat_all_indexes tables.
> If your number of tup's fetched is 100 x more than your idx_scans, you *may*
> consider reindexing.
>
> --Scott
>
>



--
David Brain
dbrain@bandwidth.com
919.297.1078

Re: Slow select performance despite seemingly reasonable query plan

From
Matthew Wakeling
Date:
On Thu, 7 May 2009, David Brain wrote:
> This has been working reasonably well, however in the last few days
> I've been seeing extremely slow performance on what are essentially
> fairly simple 'index hitting' selects on this data.  From the host
> side I see that the postgres query process is mostly in IO wait,
> however there is very little data actually being transferred (maybe
> 2-4 MB/s) - when a different query (say a select count(*) form
> datatable) will yield a sustained 150+ MB/s.

Has there been a performance *change*, or are you just concerned about a
query which doesn't seem to use "enough" disc bandwidth?

> 1. Problem with random versus sequential reads on storage system.

Certainly random access like this index scan can be extremely slow. 2-4
MB/s is quite reasonable if you're fetching one 8kB block per disc seek -
no more than 200 per second.

> 3. Problem with the host environment - one suspicion I have here is
> that we are >90% full on the storage drives (ext3), I'm not sure if
> that is impacting performance.

One concern I might have with a big setup like that is how big the
database directory has got, and whether directory lookups are taking time.
Check to see if you have the directory_index option enabled on your ext3
filesystem.

Matthew

--
 The third years are wandering about all worried at the moment because they
 have to hand in their final projects. Please be sympathetic to them, say
 things like "ha-ha-ha", but in a sympathetic tone of voice
                                        -- Computer Science Lecturer

Re: Slow select performance despite seemingly reasonable query plan

From
David Brain
Date:
Hi,

Some answers in-line:

>
> Has there been a performance *change*, or are you just concerned about a
> query which doesn't seem to use "enough" disc bandwidth?

Performance has degraded noticeably over the past few days.

> Certainly random access like this index scan can be extremely slow. 2-4 MB/s
> is quite reasonable if you're fetching one 8kB block per disc seek - no more
> than 200 per second.

We have read ahead set pretty aggressively high as the SAN seems to
'like' this, given some testing we did:

/sbin/blockdev --getra /dev/sdb
16384


> One concern I might have with a big setup like that is how big the database
> directory has got, and whether directory lookups are taking time. Check to
> see if you have the directory_index option enabled on your ext3 filesystem.
>

That's a thought, I doubt the option is set (I didn't set it and I
don't _think_ rhel does by default), however the 'base' directory only
contains ~5500 items total, so it's not getting too out of hand.

David

Re: Slow select performance despite seemingly reasonable query plan

From
Nikolas Everett
Date:
Nested Loop Left Join  (cost=0.00..6462463.96 rows=1894 width=110)
  ->  Append  (cost=0.00..6453365.66 rows=1894 width=118)
        ->  Seq Scan on datatable sum  (cost=0.00..10.75 rows=1 width=118)
              Filter: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text) AND (NOT CASE WHEN (NOT obsolete)
THEN false ELSE CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN
(obsoletereasonid = 1) THEN true WHEN (obsoletereasonid = 2) THEN true
WHEN (cdrdatasourceid = 1) THEN false ELSE true END END))
        ->  Index Scan using
datatable_20090328_customeriddatapointdate_idx on datatable_20090328
sum  (cost=0.00..542433.51 rows=180 width=49)
              Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
              Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
        ->  Index Scan using
datatable_20090404_customeriddatapointdate_idx on datatable_20090404
sum  (cost=0.00..1322098.74 rows=405 width=48)
              Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
              Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
        ->  Index Scan using
datatable_20090411_customeriddatapointdate_idx on datatable_20090411
sum  (cost=0.00..1612744.29 rows=450 width=48)
              Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
              Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
        ->  Index Scan using
datatable_20090418_customeriddatapointdate_idx on datatable_20090418
sum  (cost=0.00..1641913.58 rows=469 width=49)
              Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
              Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
        ->  Index Scan using
datatable_20090425_customeriddatapointdate_idx on datatable_20090425
sum  (cost=0.00..1334164.80 rows=389 width=49)
              Index Cond: ((datapointdate >= '2009-04-01
00:00:00'::timestamp without time zone) AND (datapointdate <=
'2009-04-30 23:59:59'::timestamp without time zone) AND
((customerid)::text = 'xxxx'::text))
              Filter: (NOT CASE WHEN (NOT obsolete) THEN false ELSE
CASE WHEN (obsoletereasonid IS NULL) THEN true WHEN (obsoletereasonid
= 1) THEN true WHEN (obsoletereasonid = 2) THEN true WHEN
(cdrdatasourceid = 1) THEN false ELSE true END END)
  ->  Index Scan using pk_cdrextension on cdrextension ext
(cost=0.00..4.77 rows=1 width=8)
        Index Cond: (sum.id = ext.datatableid)


Something doesn't look right.  Why is it doing an index scan on datatable_20090404 when the constraint for that table puts it as entirely in the date range? Shouldn't it just seq scan the partition or use the partition's customerid index?

Re: Slow select performance despite seemingly reasonable query plan

From
Matthew Wakeling
Date:
On Thu, 7 May 2009, David Brain wrote:
>> Certainly random access like this index scan can be extremely slow. 2-4 MB/s
>> is quite reasonable if you're fetching one 8kB block per disc seek - no more
>> than 200 per second.
>
> We have read ahead set pretty aggressively high as the SAN seems to
> 'like' this, given some testing we did:
>
> /sbin/blockdev --getra /dev/sdb
> 16384

Read-ahead won't really help with completely random access.

I think a much more interesting line of enquiry will be trying to work out
what has changed, and why it was fast before.

How much of the data you're accessing are you expecting to be in the OS
cache?

Is the table you're index scanning on ordered at all? Could that have
changed recently?

> That's a thought, I doubt the option is set (I didn't set it and I
> don't _think_ rhel does by default), however the 'base' directory only
> contains ~5500 items total, so it's not getting too out of hand.

I think quite a few systems do set it by default now.

Matthew

--
 Me... a skeptic?  I trust you have proof?

Re: Slow select performance despite seemingly reasonable query plan

From
Nikolas Everett
Date:


On Thu, May 7, 2009 at 11:19 AM, Matthew Wakeling <matthew@flymine.org> wrote:
On Thu, 7 May 2009, David Brain wrote:
Certainly random access like this index scan can be extremely slow. 2-4 MB/s
is quite reasonable if you're fetching one 8kB block per disc seek - no more
than 200 per second.

We have read ahead set pretty aggressively high as the SAN seems to
'like' this, given some testing we did:

/sbin/blockdev --getra /dev/sdb
16384

Read-ahead won't really help with completely random access.

Thats a shame because it would be really nice to get the entire index into shared memory or OS cache.  Most of the time queries are on data in the past few months.  All of the indexes in the past few months should fit in cache.

Did something happen to get those indexes flushed from the cache?  Were they in the cache before?


I think a much more interesting line of enquiry will be trying to work out what has changed, and why it was fast before.

How much of the data you're accessing are you expecting to be in the OS cache?

Is the table you're index scanning on ordered at all? Could that have changed recently?

I wrote the application that puts data in that table.  Its sort of ordered by that timestamp.  Every five minutes it adds rows in no particular order that need to be added.  The rows that need to be added every five minutes are ordered by another timestamp that is correlated to but not the same as the indexed timestamp.
 


That's a thought, I doubt the option is set (I didn't set it and I
don't _think_ rhel does by default), however the 'base' directory only
contains ~5500 items total, so it's not getting too out of hand.

I think quite a few systems do set it by default now.

Matthew

--
Me... a skeptic?  I trust you have proof?


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Slow select performance despite seemingly reasonable query plan

From
Laurent Wandrebeck
Date:
2009/5/7 David Brain <dbrain@bandwidth.com>:
> Hi,
Hi,
>
> Some answers in-line:
>
>>
>> Has there been a performance *change*, or are you just concerned about a
>> query which doesn't seem to use "enough" disc bandwidth?
>
> Performance has degraded noticeably over the past few days.
>
>> Certainly random access like this index scan can be extremely slow. 2-4 MB/s
>> is quite reasonable if you're fetching one 8kB block per disc seek - no more
>> than 200 per second.
>
> We have read ahead set pretty aggressively high as the SAN seems to
> 'like' this, given some testing we did:
>
> /sbin/blockdev --getra /dev/sdb
> 16384
>
>
>> One concern I might have with a big setup like that is how big the database
>> directory has got, and whether directory lookups are taking time. Check to
>> see if you have the directory_index option enabled on your ext3 filesystem.
>>
>
> That's a thought, I doubt the option is set (I didn't set it and I
> don't _think_ rhel does by default), however the 'base' directory only
> contains ~5500 items total, so it's not getting too out of hand.
default rhel ext3 options are (in 4.x and 5.x) :
Filesystem features:      has_journal ext_attr resize_inode dir_index
filetype needs_recovery sparse_super large_file
See tune2fs -l /dev/sdXY
Laurent.