Thread: Some very weird behaviour....

From: Chris Bowlby

Hi All,

 I'm sure some of you know me from previous questions on other lists,
but this one has Marc and me completely stumped. We've got a database
that has about 89 million rows, under PostgreSQL 7.3.3 on a dual
PIII 1.2 GHz with 4 GBytes of RAM and a 5 disk RAID 5 array. The
dataset itself is about 26+ GBytes in size, all of it in the one table.

 To give you some perspective on the size of the dataset and the
performance level we are hitting, here are some "good" results based on
some explains:

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-21';
                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=3013.55..3013.55 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=48.40..2721.26 rows=249837 loops=1)
         Index Cond: ((some_time)::date = '2003-05-21'::date)
 Total runtime: 3015.02 msec
(4 rows)

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-21';
                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=1401.23..1401.23 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=0.50..1118.92 rows=249837 loops=1)
         Index Cond: ((some_time)::date = '2003-05-21'::date)
 Total runtime: 1401.42 msec
(4 rows)
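
 (For context: these plans can use an index scan for the ::date
comparison only because some_table_ix_0 is a functional index on that
expression. We don't have the definition in front of us, so treat the
exact form below as a sketch of what it would look like in 7.3:)

jnlstats=# -- 7.3 allows an index on a function call such as date(some_time);
jnlstats=# -- the planner can then match some_time::date = '...' against it.
jnlstats=# create index some_table_ix_0 on some_table (date(some_time));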

 There are about 249837 rows that the query identifies as matching, and
over ten runs the timings range between 1 and 1.4 seconds, with the
initial query taking 3 seconds. That is how about 90% of the queries
respond, but we've got several peaks that we can not explain in any
way. For instance:

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-26';
                                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=307025.65..307025.65 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=51.05..306256.93 rows=374540 loops=1)
         Index Cond: ((some_time)::date = '2003-05-26'::date)
 Total runtime: 307025.81 msec
(4 rows)

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-26';
                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=10837.86..10837.86 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=1.01..10304.78 rows=374540 loops=1)
         Index Cond: ((some_time)::date = '2003-05-26'::date)
 Total runtime: 10838.04 msec
(4 rows)
 The total number of items counted is 374540, so not too much more than
the previous query, but the 300 second runtime was unexpected (we were
expecting ~4-5 seconds for the first run and then ~1-2 seconds for the
cached results). I have 5 other dates that all exhibit this behaviour,
but it's ONLY those dates that run that slow, and the one I presented
above is the largest of them all. The database server is configured
with a 5 MByte shared memory buffer, but even a larger shared memory
buffer does not help (we have had it set to 800 MBytes before). The
disk is getting hit the heaviest during that last query, with iostat
results being:

      tty             da0              da1              da2             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   25 12.22   1  0.02   7.68   0  0.00   7.68   0  0.00   0  0  0  0 99
   4 3758  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   6  0 38  0 56
   0  151  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   9  0 43  0 48
   0  148  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  10  0 40  0 49
   0  153  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  10  0 40  0 49
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  10  0 40  1 49
   0  150  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   9  0 42  0 49
   0  153  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   9  0 41  1 49
   0  149  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   8  0 45  0 48
   0  148  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  11  0 41  0 48
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  12  0 38  0 50
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  14  0 37  0 49
   0  152 16.00  20  0.31   8.00   9  0.07   8.00  16  0.12   0  0  1  0 98
   0  152  2.00   1  0.00   0.00   0  0.00   1.00   2  0.00   0  0  0  0 100
   0  152 11.33   6  0.07   0.00   0  0.00   1.00   2  0.00   0  0  0  0 99
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   1  0  0  0 99
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   1  0  0  0 99
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
   0  154  0.00   0  0.00   5.89  18  0.10   0.00   0  0.00   0  0  1  0 98

 The database is vacuumed about once every hour, the pg_stats table
shows:

jnlstats=# select tablename, attname, n_distinct from pg_stats where tablename = 'some_table';
 tablename  |  attname  | n_distinct
------------+-----------+------------
 some_table | some_time |   -0.24305
(1 row)
 So the column is very distinct (a negative n_distinct means the number
of distinct values is estimated as that fraction of the row count, so
roughly 0.24 times the table size here). What I'm looking for is any
explanation of what might be causing those spikes, but at the same time
a possible solution that might help me bring them down some...
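
 (A quick sanity check of that reading: since a negative n_distinct is
a fraction of the row count, the estimated number of distinct some_time
values can be backed out from pg_class, e.g.:

jnlstats=# -- reltuples is the planner's row count estimate for the table.
jnlstats=# select 0.24305 * reltuples as est_distinct
jnlstats-#   from pg_class where relname = 'some_table';

That comes out to roughly 0.24 * 89 million, i.e. around 21-22 million
distinct timestamps.)
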
--
Chris Bowlby <excalibur@hub.org>
Hub.Org Networking Services


Re: Some very weird behaviour....

From: Rod Taylor

>  To give you some perspective on the size of the dataset and the
> performance level we are hitting, here are some "good" results based on
> some explains:

Before Tom jumps in taking all the fun out of trying to solve it...


The estimates in the slow queries seem perfectly reasonable.  In fact,
the cost estimates of the slow and fast queries are identical, which is
what you would expect if the data were distributed evenly throughout
the table.

Given it's a date, I would guess that the data is generally inserted
into the table in date order, but for some reason those 'high' dates
have their rows spread more evenly throughout the table.  Clustered
data incurs fewer disk seeks and touches fewer pages in general, which
makes for a much faster query.  Scattered data has to pull
significantly more information off the disk, throwing most of it away.
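
One quick way to probe that from the statistics: pg_stats has a
'correlation' column measuring how well the physical row order tracks
the column's sort order (near 1.0 means well clustered, near 0 means
scattered).  It is a whole-table number, so it won't single out the
slow dates, but a low value would fit the theory.  Something like:

  SELECT tablename, attname, correlation
    FROM pg_stats
   WHERE tablename = 'some_table' AND attname = 'some_time';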

I would guess that sometime on 2003-05-25 someone did a bit of data
cleaning (deleting records).  The next day the free space map had
entries available in various locations within the table, and inserts
used them rather than appending to the end.  With 89 million records
keyed by date, I'm guessing there aren't normally very many updates or
deletes on it.

So.. how to solve the problem?  If this is the type of query that
occurs most often, you do primarily inserts, and the inserts generally
arrive in date order, cluster the table on index "some_table_ix_0".
The clustering won't degrade very much, since that is how you naturally
insert the data.
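
Roughly, on 7.3 (the index and table names are taken from your EXPLAIN
output; the rest is a sketch):

  -- CLUSTER rewrites the table in index order and takes an exclusive
  -- lock, so plan for downtime on a 26 GB table.
  CLUSTER some_table_ix_0 ON some_table;
  -- Re-analyze so the planner sees the new physical ordering.
  ANALYZE some_table;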


Re: Some very weird behaviour....

From: Chris Bowlby

On Wed, 2003-07-09 at 14:42, Rod Taylor wrote:

 Clustering definitely helped with that case, and appears to have
helped with all of the dates that had high execution times... thanks
for the tip.

> >  To give you some perspective on the size of the dataset and the
> > performance level we are hitting, here are some "good" results based on
> > some explains:
>
> Before Tom jumps in taking all the fun out of trying to solve it...
>
>
> The estimates in the slow queries seem perfectly reasonable.  In fact,
> the cost estimates of the slow and fast queries are identical, which is
> what you would expect if the data were distributed evenly throughout
> the table.
>
> Given it's a date, I would guess that the data is generally inserted
> into the table in date order, but for some reason those 'high' dates
> have their rows spread more evenly throughout the table.  Clustered
> data incurs fewer disk seeks and touches fewer pages in general, which
> makes for a much faster query.  Scattered data has to pull
> significantly more information off the disk, throwing most of it away.
>
> I would guess that sometime on 2003-05-25 someone did a bit of data
> cleaning (deleting records).  The next day the free space map had
> entries available in various locations within the table, and inserts
> used them rather than appending to the end.  With 89 million records
> keyed by date, I'm guessing there aren't normally very many updates or
> deletes on it.
>
> So.. how to solve the problem?  If this is the type of query that
> occurs most often, you do primarily inserts, and the inserts generally
> arrive in date order, cluster the table on index "some_table_ix_0".
> The clustering won't degrade very much, since that is how you naturally
> insert the data.
--
Chris Bowlby <excalibur@hub.org>
Hub.Org Networking Services