Thread: PG8.2.1 choosing slow seqscan over idx scan
Running PostgreSQL 8.2.1 on Win32. The query planner is choosing a seq
scan over an index scan even though the index scan is faster (as shown
by disabling seqscan). The table is recently analyzed and row count
estimates seem to be in the ballpark.

Another tidbit - I haven't done a "vacuum full" ever, although I
"vacuum analyze" regularly (and autovacuum). I recently noticed that
the PG data drive is 40% fragmented (NTFS). Could that be making the
seqscan slower than it should be? Regardless of fragmentation's effect
on performance, is the query planner making a good decision here?

SOME CONFIGURATION PARAMS
effective_cache_size=1000MB
random_page_cost=3
default_statistics_target=50
shared_buffers=400MB
temp_buffers=10MB
work_mem=10MB
checkpoint_segments=12

QUERY
select merchant_dim_id, dcms_dim_id,
       sum(success) as num_success, sum(failed) as num_failed,
       count(*) as total_transactions,
       (sum(success) * 1.0 / count(*)) as success_rate
from transaction_facts
where transaction_date >= '2007-1-16'
  and transaction_date < '2007-1-16 15:20'
group by merchant_dim_id, dcms_dim_id;

EXPLAIN ANALYZE (enable_seqscan=true)
HashAggregate  (cost=339573.01..340089.89 rows=15904 width=16) (actual time=140606.593..140650.573 rows=10549 loops=1)
  ->  Seq Scan on transaction_facts  (cost=0.00..333928.25 rows=322558 width=16) (actual time=19917.957..140036.910 rows=347434 loops=1)
        Filter: ((transaction_date >= '2007-01-16 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-16 15:20:00'::timestamp without time zone))
Total runtime: 140654.813 ms

EXPLAIN ANALYZE (enable_seqscan=false)
HashAggregate  (cost=379141.53..379658.41 rows=15904 width=16) (actual time=3720.838..3803.748 rows=10549 loops=1)
  ->  Bitmap Heap Scan on transaction_facts  (cost=84481.80..373496.76 rows=322558 width=16) (actual time=244.568..3133.741 rows=347434 loops=1)
        Recheck Cond: ((transaction_date >= '2007-01-16 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-16 15:20:00'::timestamp without time zone))
        ->  Bitmap Index Scan on transaction_facts_transaction_date_idx  (cost=0.00..84401.16 rows=322558 width=0) (actual time=241.994..241.994 rows=347434 loops=1)
              Index Cond: ((transaction_date >= '2007-01-16 00:00:00'::timestamp without time zone) AND (transaction_date < '2007-01-16 15:20:00'::timestamp without time zone))
Total runtime: 3810.795 ms
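(For completeness, the two plans above were captured by flipping the planner
setting in the same psql session - something along these lines, using the
query as posted:)

-- force the planner away from the sequential scan for this session only
set enable_seqscan = false;

explain analyze
select merchant_dim_id, dcms_dim_id,
       sum(success) as num_success, sum(failed) as num_failed,
       count(*) as total_transactions,
       (sum(success) * 1.0 / count(*)) as success_rate
from transaction_facts
where transaction_date >= '2007-1-16'
  and transaction_date < '2007-1-16 15:20'
group by merchant_dim_id, dcms_dim_id;

-- put the planner back to its default behaviour
reset enable_seqscan;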
"Jeremy Haile" <jhaile@fastmail.fm> writes: > Running PostgreSQL 8.2.1 on Win32. The query planner is choosing a seq > scan over index scan even though index scan is faster (as shown by > disabling seqscan). Table is recently analyzed and row count estimates > seem to be in the ballpark. Try reducing random_page_cost a bit. Keep in mind that you are probably measuring a fully-cached situation here, if you repeated the test case. If your database fits into memory reasonably well then that's fine and you want to optimize for that case ... but otherwise you may find yourself pessimizing the actual behavior. regards, tom lane
Thanks Tom!

Reducing random_page_cost to 2 did the trick for this query. It now
favors the index scan. Even if this is a cached situation, I wouldn't
expect a difference of 3 min vs 3 seconds.

Even if unrelated, do you think disk fragmentation would have negative
effects? Is it worth trying to defragment the drive on a regular basis
in Windows?

Jeremy Haile

On Tue, 16 Jan 2007 16:39:07 -0500, "Tom Lane" <tgl@sss.pgh.pa.us> said:
> "Jeremy Haile" <jhaile@fastmail.fm> writes:
> > Running PostgreSQL 8.2.1 on Win32. The query planner is choosing a seq
> > scan over index scan even though index scan is faster (as shown by
> > disabling seqscan). Table is recently analyzed and row count estimates
> > seem to be in the ballpark.
>
> Try reducing random_page_cost a bit. Keep in mind that you are probably
> measuring a fully-cached situation here, if you repeated the test case.
> If your database fits into memory reasonably well then that's fine and
> you want to optimize for that case ... but otherwise you may find
> yourself pessimizing the actual behavior.
>
> regards, tom lane
On 1/16/07, Jeremy Haile <jhaile@fastmail.fm> wrote:
> Even if unrelated, do you think disk fragmentation would have negative
> effects? Is it worth trying to defragment the drive on a regular basis
> in Windows?
Out of curiosity, is this table heavily updated or deleted from? Perhaps there is an unfavorable "correlation" between the btree and data? Can you dump the results of
select attname, null_frac, avg_width, n_distinct, correlation from pg_stats where tablename = 'transaction_facts'
--
Chad
http://www.postgresqlforums.com/
Hey Chad,

The table is heavily inserted and deleted from. Recently I had done a
very large delete.

Here is the results of the query you sent me: (sorry it's hard to read)

"dcms_dim_id";0;4;755;-0.00676181
"transaction_fact_id";0;4;-1;-0.194694
"failed";0;4;2;0.964946
"van16";0;23;145866;0.00978649
"vendor_response";0.9942;43;9;0.166527
"transaction_id";0;4;-1;-0.199583
"transaction_date";0;8;172593;-0.194848
"serial_number";0.0434667;16;53311;0.0713039
"merchant_dim_id";0;4;105;0.299335
"comment";0.0052;29;7885;0.0219167
"archived";0;1;2;0.84623
"response_code";0.9942;4;3;0.905409
"transaction_source";0;4;2;0.983851
"location_dim_id";0;4;86;0.985384
"success";0;4;2;0.981072

Just curious - what does that tell us?

Jeremy Haile

On Tue, 16 Jan 2007 17:44:53 -0500, "Chad Wagner" <chad.wagner@gmail.com> said:
> On 1/16/07, Jeremy Haile <jhaile@fastmail.fm> wrote:
> >
> > Even if unrelated, do you think disk fragmentation would have negative
> > effects? Is it worth trying to defragment the drive on a regular basis
> > in Windows?
>
> Out of curiosity, is this table heavily updated or deleted from? Perhaps
> there is an unfavorable "correlation" between the btree and data? Can you
> dump the results of
>
> select attname, null_frac, avg_width, n_distinct, correlation from pg_stats
> where tablename = 'transaction_facts'
>
> --
> Chad
> http://www.postgresqlforums.com/
On 1/16/07, Jeremy Haile <jhaile@fastmail.fm> wrote:
> The table is heavily inserted and deleted from. Recently I had done a
> very large delete.

That's what I suspected.

> Here is the results of the query you sent me: (sorry it's hard to read)
> "transaction_date";0;8;172593;-0.194848
> Just curious - what does that tell us?
Based on the explain plan you posted earlier we learned the optimizer believed the query would return 322558 rows (and it was reasonably accurate, too) for a relatively small time frame (15 hours and 20 minutes).
-> Seq Scan on transaction_facts (cost=0.00..333928.25 rows=322558
width=16) (actual time=19917.957..140036.910 rows=347434 loops=1)
Based on the information you just posted, the average row length is 156 bytes.
347434 rows * 156 bytes = 52MB (reasonable it could be held in your shared buffers, which makes Tom's suggestion very plausible, the index scan may not be cheaper -- because it is all cached)
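(That 156-byte figure can be pulled straight from the same pg_stats view the
numbers above came from - a rough sketch, counting data bytes only and
ignoring per-row and page overhead:)

select sum(avg_width) as avg_row_data_bytes
from pg_stats
where tablename = 'transaction_facts';
-- sums to 156 with the stats posted above; times the ~347k rows returned
-- that is roughly 52MB of row data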
The estimation of cost when you are requesting a "range" of data will involve the "correlation" factor, the correlation is defined as:
"Statistical correlation between physical row ordering and logical ordering of the column values. This ranges from -1 to +1. When the value is near -1 or +1, an index scan on the column will be estimated to be cheaper than when it is near zero, due to reduction of random access to the disk. (This column is NULL if the column data type does not have a < operator.)"
Which means that as the correlation approaches zero (yours is -0.19, which I would call zero), the physical ordering of the data (in the data files) is such that a range scan of the btree index would result in many scatter reads (which are slow). So the optimizer considers whether a "scattered" read will be cheaper than a sequential scan based on a few factors: a) correlation [for ranges] b) size of table c) estimated cardinality [what does it think you are going to get back]. Based on those factors it may choose either access path (sequential or index).
One of the reasons why the sequential scan is slower is because the optimizer doesn't know the data you are requesting is sitting in the cache (and it is VERY unlikely you have the entire table in cache, unless it is a heavily used table or very small table, which it's probably not). So once it decides a sequential scan is the best plan, it starts chugging away at the disk, pulling in most of the pages in the table and throwing away the pages that do not meet your criteria.
The index scan is quicker (may be bogus, as Tom suggested) because it starts chugging away at the index and finds that many of the pages you are interested in are cached (but it didn't know, you just got lucky!).
In practice, once you start pulling in 15% or more of the table it is often cheaper just to read the entire table in, rather than scatter reads + double I/O. Remember that an index access means I have to read the index PLUS the table from disk, and a sequential scan just throws away the index and reads the table from disk.
I would suggest running explain analyze after restarting the database (that may not even be good enough, because a large portion of the data file may be in the OS's buffer cache, hrm -- reboot? unmount?) and seeing how cheap that index access path is.
One thing to be careful of here is that you really need to consider what is the primary use of the table, and what are the major queries you will be launching against it. But you could improve the correlation by rebuilding the table ordered by the transaction_date column, but it may screw up other range scans. Another option is partitioning. I wouldn't do any of this stuff, until you find out the last tweak you made still holds true, give it a few days, perhaps test it after a clean reboot of the server.
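(A sketch of what that rebuild could look like on 8.2, using the
transaction_date index that already appears in the plans above - note that
CLUSTER takes an exclusive lock and rewrites the whole table, so it is a
maintenance-window operation:)

-- rewrite the heap in transaction_date order (8.2 syntax: index first, then table)
cluster transaction_facts_transaction_date_idx on transaction_facts;
-- refresh statistics, including the correlation column discussed above
analyze transaction_facts;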
Let me know if any of this is inaccurate folks, as I certainly don't profess to be an expert on the internals of PostgreSQL, but I believe it is accurate based on my prior experiences with other database products. :)
--
Chad
http://www.postgresqlforums.com/
On Tue, 2007-01-16 at 21:58 -0500, Jeremy Haile wrote:
> Hey Chad,
>
> The table is heavily inserted and deleted from. Recently I had done a
> very large delete.

I still keep wondering if this table is bloated with dead tuples. Even
if you vacuum often, if there's a connection with an idle transaction
the tuples can't be reclaimed and the table would continue to grow.

Anyway, what does vacuum analyze tablename say (where tablename is, of
course, the name of the table we're looking at)? Pay particular
attention to DETAIL statements.

Assuming the table's NOT bloated, you may do well to increase the
effective_cache_size, which doesn't allocate anything, but just tells
the query planner about how big your operating system's file cache is
as regards postgresql. It's a bit of a coarse setting, i.e. you can
make rather large changes to it without blowing things up. If you've
got a couple gigs on your machine, try setting it to something like
512MB or so.

If your table is bloating, and you don't have idle transactions hanging
off the database, it could be that your fsm settings are too low.
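(A quick way to check for the idle-transaction problem mentioned above - a
sketch against the 8.2 statistics views, assuming stats_command_string is
enabled, which it is by default:)

-- long-lived rows here prevent VACUUM from reclaiming dead tuples
select procpid, usename, query_start
from pg_stat_activity
where current_query = '<IDLE> in transaction';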
A good idea here will be to first do a VACUUM FULL and then keep the Autovacuum settings you want.
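(A sketch of that maintenance pass for the table in this thread - VACUUM
FULL takes an exclusive lock on the table, so it belongs in a maintenance
window:)

vacuum full verbose transaction_facts;
-- refresh statistics once the table has been compacted
analyze transaction_facts;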
-----------------
Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
On 1/17/07, Jeremy Haile <jhaile@fastmail.fm> wrote:
> I still keep wondering if this table is bloated with dead tuples. Even
> if you vacuum often if there's a connection with an idle transaction,
> the tuples can't be reclaimed and the table would continue to grow.
I used to vacuum once an hour, although I've switched it to autovacuum
now. It definitely could be bloated with dead tuples. I'll paste the
"vacuum analyze verbose" output at the bottom of this e-mail. Would a
vacuum full be a good idea?
> Assuming the table's NOT bloated, you may do well to increase the
> effective_cache_size, which doesn't allocate anything,
<snip>
> try setting it to something like 512MB or so.
It's currently set to 1000MB.
> If your table is bloating, and you don't have idle transactions hanging
> off the database, it could be that your fsm settings are too low.
fsm is currently set to 2000000. Is there any harm in setting it too
high? =)
Here's the vacuum analyze verbose output:
INFO: vacuuming "public.transaction_facts"
INFO: scanned index "transaction_facts_pkey" to remove 759969 row
versions
DETAIL: CPU 7.20s/2.31u sec elapsed 315.31 sec.
INFO: scanned index "transaction_facts_dcms_dim_id_idx" to remove
759969 row versions
DETAIL: CPU 1.29s/2.15u sec elapsed 146.98 sec.
INFO: scanned index "transaction_facts_merchant_dim_id_idx" to remove
759969 row versions
DETAIL: CPU 1.10s/2.10u sec elapsed 126.09 sec.
INFO: scanned index "transaction_facts_transaction_date_idx" to remove
759969 row versions
DETAIL: CPU 1.65s/2.40u sec elapsed 259.25 sec.
INFO: scanned index "transaction_facts_transaction_id_idx" to remove
759969 row versions
DETAIL: CPU 7.48s/2.85u sec elapsed 371.98 sec.
INFO: scanned index "transaction_facts_product_date_idx" to remove
759969 row versions
DETAIL: CPU 2.32s/2.10u sec elapsed 303.83 sec.
INFO: scanned index "transaction_facts_merchant_product_date_idx" to
remove 759969 row versions
DETAIL: CPU 2.48s/2.31u sec elapsed 295.19 sec.
INFO: scanned index "transaction_facts_merchant_date_idx" to remove
759969 row versions
DETAIL: CPU 8.10s/3.35u sec elapsed 398.73 sec.
INFO: scanned index "transaction_facts_success_idx" to remove 759969
row versions
DETAIL: CPU 5.01s/2.84u sec elapsed 192.73 sec.
INFO: scanned index "transaction_facts_failed_idx" to remove 759969 row
versions
DETAIL: CPU 1.03s/1.90u sec elapsed 123.00 sec.
INFO: scanned index "transaction_facts_archived_idx" to remove 759969
row versions
DETAIL: CPU 1.03s/1.39u sec elapsed 104.42 sec.
INFO: scanned index "transaction_facts_response_code_idx" to remove
759969 row versions
DETAIL: CPU 0.75s/2.17u sec elapsed 36.71 sec.
INFO: scanned index "transaction_facts_transaction_source_idx" to
remove 759969 row versions
DETAIL: CPU 0.60s/1.75u sec elapsed 42.29 sec.
INFO: scanned index "transaction_facts_transaction_id_source_idx" to
remove 759969 row versions
DETAIL: CPU 1.14s/1.84u sec elapsed 44.75 sec.
INFO: "transaction_facts": removed 759969 row versions in 14360 pages
DETAIL: CPU 0.57s/0.23u sec elapsed 45.28 sec.
INFO: index "transaction_facts_pkey" now contains 2274280 row versions
in 152872 pages
DETAIL: 759969 index row versions were removed.
134813 index pages have been deleted, 134813 are currently reusable.
CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: index "transaction_facts_dcms_dim_id_idx" now contains 2274280
row versions in 85725 pages
DETAIL: 759323 index row versions were removed.
75705 index pages have been deleted, 73721 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_merchant_dim_id_idx" now contains
2274280 row versions in 80023 pages
DETAIL: 759969 index row versions were removed.
71588 index pages have been deleted, 69210 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_transaction_date_idx" now contains
2274280 row versions in 144196 pages
DETAIL: 759969 index row versions were removed.
126451 index pages have been deleted, 126451 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_transaction_id_idx" now contains 2274280
row versions in 150529 pages
DETAIL: 759969 index row versions were removed.
130649 index pages have been deleted, 130649 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_product_date_idx" now contains 2274280
row versions in 202248 pages
DETAIL: 759969 index row versions were removed.
174652 index pages have been deleted, 174652 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_merchant_product_date_idx" now contains
2274280 row versions in 202997 pages
DETAIL: 759969 index row versions were removed.
175398 index pages have been deleted, 175398 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_merchant_date_idx" now contains 2274280
row versions in 203561 pages
DETAIL: 759969 index row versions were removed.
175960 index pages have been deleted, 175960 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_success_idx" now contains 2274280 row
versions in 78237 pages
DETAIL: 759969 index row versions were removed.
70239 index pages have been deleted, 67652 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_failed_idx" now contains 2274280 row
versions in 78230 pages
DETAIL: 759969 index row versions were removed.
70231 index pages have been deleted, 67665 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_archived_idx" now contains 2274280 row
versions in 72943 pages
DETAIL: 759969 index row versions were removed.
64962 index pages have been deleted, 62363 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_response_code_idx" now contains 2274280
row versions in 16918 pages
DETAIL: 759969 index row versions were removed.
8898 index pages have been deleted, 6314 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_transaction_source_idx" now contains
2274280 row versions in 14235 pages
DETAIL: 759969 index row versions were removed.
6234 index pages have been deleted, 3663 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "transaction_facts_transaction_id_source_idx" now contains
2274280 row versions in 18053 pages
DETAIL: 759969 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "transaction_facts": found 759969 removable, 2274280 nonremovable
row versions in 308142 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 15710471 unused item pointers.
266986 pages contain useful free space.
0 pages are entirely empty.
CPU 58.00s/35.59u sec elapsed 3240.94 sec.
INFO: analyzing "public.transaction_facts"
INFO: "transaction_facts": scanned 15000 of 308142 pages, containing
113476 live rows and 0 dead rows; 15000 rows in sample, 2331115
estimated total rows
Thanks for the great info Chad. I'm learning a lot from this thread!

> 347434 rows * 156 bytes = 52MB (reasonable it could be held in your
> shared buffers, which makes Tom's suggestion very plausible, the
> index scan may not be cheaper -- because it is all cached)

Maybe - I tried running the same query for an older time range that is
less likely to be cached. The index scan took longer than my previous
example, but still only took 16 seconds, compared to the 87 seconds
required to seqscan the table. When I can, I'll restart the machine and
run a comparison again to get a "pure" test.

> One of the reasons why the sequential scan is slower is because the
> optimizer doesn't know the data you are requesting is sitting in the
> cache (and it is VERY unlikely you have the entire table in cache,
> unless it is a heavily used table or very small table, which it's
> probably not).

This is a large table (3 million rows). Rows are typically inserted in
date order, although large numbers of rows are deleted every night.
Basically, this table contains a list of transactions in a rolling time
window. So inserts happen throughout the day, and then a day's worth of
old rows are deleted every night. The most accessed rows are going to
be today's rows, which is a small subset of the overall table. (maybe
14%)

> One thing to be careful of here is that you really need to consider
> what is the primary use of the table, and what are the major queries
> you will be launching against it. But you could improve the
> correlation by rebuilding the table ordered by the transaction_date
> column, but it may screw up other range scans.

Date is almost always a criteria in scans of this table. As mentioned
earlier, the table is naturally built in date order - so would
rebuilding the table help? Is it possible that even though I'm
inserting in date order, since I delete rows so often the physical
correlation would get disrupted as disk pages are reused? Perhaps
clustering on the transaction_date index and periodically running
"cluster" would help? Does vacuum full help with this at all?

> Another option is partitioning. I wouldn't do any of this
> stuff, until you find out the last tweak you made still holds true,
> give it a few days, perhaps test it after a clean reboot of the server.

Yeah - partitioning makes a lot of sense and I've thought about doing
this in the past. Although I do run queries that cross multiple days,
most of my queries only look at today's data, so the physical disk
organization would likely be much better with a partitioned table
setup. Also, since I usually delete old data one day at a time, I could
simply drop the old day's partition. This would make vacuuming much
less of an issue.

But I won't be making any changes immediately, so I'll continue to run
tests given your advice.

Thanks again,
Jeremy Haile
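(For anyone following along, a rough sketch of the daily-partition idea on
8.2 using constraint exclusion - the child table and index names here are
invented for illustration, and the real fact table obviously has more
columns and indexes:)

-- one child table per day, hanging off the existing parent
create table transaction_facts_20070116 (
    check (transaction_date >= '2007-01-16'::timestamp
       and transaction_date <  '2007-01-17'::timestamp)
) inherits (transaction_facts);

create index transaction_facts_20070116_date_idx
    on transaction_facts_20070116 (transaction_date);

-- with this on, queries filtered on transaction_date skip non-matching children
set constraint_exclusion = on;

-- ageing out a day becomes a cheap catalog operation instead of a big DELETE
drop table transaction_facts_20070116;

-- note: on 8.2 inserts still have to be directed at the right child table,
-- either by the application or via rules/triggers on the parent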
On 17-Jan-07, at 9:37 AM, Jeremy Haile wrote:
>> I still keep wondering if this table is bloated with dead tuples. Even
>> if you vacuum often if there's a connection with an idle transaction,
>> the tuples can't be reclaimed and the table would continue to grow.
>
> I used to vacuum once an hour, although I've switched it to autovacuum
> now. It definitely could be bloated with dead tuples. I'll paste the
> "vacuum analyze verbose" output at the bottom of this e-mail. Would a
> vacuum full be a good idea?
>
>> Assuming the table's NOT bloated, you may do well to increase the
>> effective_cache_size, which doesn't allocate anything,
> <snip>
>> try setting it to something like 512MB or so.
>
> It's currently set to 1000MB.

How much memory does the box have?

>> If your table is bloating, and you don't have idle transactions hanging
>> off the database, it could be that your fsm settings are too low.
>
> fsm is currently set to 2000000. Is there any harm in setting it too
> high? =)

Yes, it takes up space.
> INFO: index "transaction_facts_dcms_dim_id_idx" now contains 2274280 > row versions in 85725 pages > DETAIL: 759323 index row versions were removed. > 75705 index pages have been deleted, 73721 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_merchant_dim_id_idx" now contains > 2274280 row versions in 80023 pages > DETAIL: 759969 index row versions were removed. > 71588 index pages have been deleted, 69210 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_transaction_date_idx" now contains > 2274280 row versions in 144196 pages > DETAIL: 759969 index row versions were removed. > 126451 index pages have been deleted, 126451 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_transaction_id_idx" now contains > 2274280 > row versions in 150529 pages > DETAIL: 759969 index row versions were removed. > 130649 index pages have been deleted, 130649 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_product_date_idx" now contains 2274280 > row versions in 202248 pages > DETAIL: 759969 index row versions were removed. > 174652 index pages have been deleted, 174652 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_merchant_product_date_idx" now > contains > 2274280 row versions in 202997 pages > DETAIL: 759969 index row versions were removed. > 175398 index pages have been deleted, 175398 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_merchant_date_idx" now contains > 2274280 > row versions in 203561 pages > DETAIL: 759969 index row versions were removed. > 175960 index pages have been deleted, 175960 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_success_idx" now contains 2274280 row > versions in 78237 pages > DETAIL: 759969 index row versions were removed. > 70239 index pages have been deleted, 67652 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_failed_idx" now contains 2274280 row > versions in 78230 pages > DETAIL: 759969 index row versions were removed. > 70231 index pages have been deleted, 67665 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_archived_idx" now contains 2274280 row > versions in 72943 pages > DETAIL: 759969 index row versions were removed. > 64962 index pages have been deleted, 62363 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_response_code_idx" now contains > 2274280 > row versions in 16918 pages > DETAIL: 759969 index row versions were removed. > 8898 index pages have been deleted, 6314 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_transaction_source_idx" now contains > 2274280 row versions in 14235 pages > DETAIL: 759969 index row versions were removed. > 6234 index pages have been deleted, 3663 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "transaction_facts_transaction_id_source_idx" now > contains > 2274280 row versions in 18053 pages > DETAIL: 759969 index row versions were removed. > 0 index pages have been deleted, 0 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. 
> INFO: "transaction_facts": found 759969 removable, 2274280 > nonremovable > row versions in 308142 pages > DETAIL: 0 dead row versions cannot be removed yet. > There were 15710471 unused item pointers. > 266986 pages contain useful free space. > 0 pages are entirely empty. > CPU 58.00s/35.59u sec elapsed 3240.94 sec. > INFO: analyzing "public.transaction_facts" > INFO: "transaction_facts": scanned 15000 of 308142 pages, containing > 113476 live rows and 0 dead rows; 15000 rows in sample, 2331115 > estimated total rows > > ---------------------------(end of > broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
> How much memory does the box have

2GB

> Yes, it takes up space

Well, I upped max_fsm_pages to 2000000 because vacuums were failing
with it set to 1500000. However, I'm now autovacuuming, which might be
keeping my fsm lower. I didn't realize that setting it too high had
negative effects, so I'll try to get a better handle on how large this
needs to be.
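(One way to get that handle rather than guessing: a database-wide VACUUM
VERBOSE prints a summary of free space map usage at the very end, which can
be compared against max_fsm_pages - a sketch, with the output wording
approximate:)

-- run against the whole database; the final INFO/DETAIL lines report how
-- many page slots are in use and how many would be needed to track all
-- free space, versus the current max_fsm_pages / max_fsm_relations limits
vacuum verbose;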
On Wed, 2007-01-17 at 08:37, Jeremy Haile wrote:
> > I still keep wondering if this table is bloated with dead tuples. Even
> > if you vacuum often if there's a connection with an idle transaction,
> > the tuples can't be reclaimed and the table would continue to grow.
>
> I used to vacuum once an hour, although I've switched it to autovacuum
> now. It definitely could be bloated with dead tuples. I'll paste the
> "vacuum analyze verbose" output at the bottom of this e-mail. Would a
> vacuum full be a good idea?
>
> > Assuming the table's NOT bloated, you may do well to increase the
> > effective_cache_size, which doesn't allocate anything,
> <snip>
> > try setting it to something like 512MB or so.
>
> It's currently set to 1000MB.

Sounds about right for a machine with 2G memory (as you mentioned
elsewhere)

> > If your table is bloating, and you don't have idle transactions hanging
> > off the database, it could be that your fsm settings are too low.
>
> fsm is currently set to 2000000. Is there any harm in setting it too
> high? =)

As long as you don't run out of memory, it can be pretty high. note
that it uses shared memory, so setting it too high can cause the db to
fail to start.

> INFO: scanned index "transaction_facts_pkey" to remove 759969 row
> versions
> DETAIL: CPU 7.20s/2.31u sec elapsed 315.31 sec.
> INFO: scanned index "transaction_facts_dcms_dim_id_idx" to remove
> 759969 row versions

SNIP

> DETAIL: 759969 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> INFO: "transaction_facts": found 759969 removable, 2274280 nonremovable
> row versions in 308142 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 15710471 unused item pointers.
> 266986 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 58.00s/35.59u sec elapsed 3240.94 sec.

That's about 32% dead rows. Might be worth scheduling a vacuum full,
but it's not like I was afraid it might be. It looks to me like you
could probably use a faster I/O subsystem in that machine though.

If the random page cost being lower fixes your issues, then I'd just
run with it lower for now. note that while lowering it may fix one
query, it may break another. Tuning pgsql, like any database, is as
much art as science...
> That's about 32% dead rows. Might be worth scheduling a vacuum full,
> but it's not like I was afraid it might be. It looks to me like you
> could probably use a faster I/O subsystem in that machine though.

I'll try to schedule a full vacuum tonight. As far as I/O - it's using
SAN over fiber. Not as fast as internal SCSI though...
On 1/17/07, Jeremy Haile <jhaile@fastmail.fm> wrote:
> Maybe - I tried running the same query for an older time range that is
> less likely to be cached. The index scan took longer than my previous
> example, but still only took 16 seconds, compared to the 87 seconds
> required to seqscan the table. When I can, I'll restart the machine and
> run a comparison again to get a "pure" test.
Heh, querying a different range of data was a better idea compared to rebooting.. doh.. So I think you reasonably established that an index scan for unbuffered data would still be faster than a sequential scan.
To me this is one of those cases where the optimizer doesn't understand the clustering of the data, and it is being misled by the statistics and fixed parameters it has. If you have fast disks (I think a fiber SAN probably counts here) then adjusting random_page_cost lower is reasonable, the lowest I have heard recommended is 2.0. It would be nice if the database could learn to estimate these values, as newer versions of Oracle does.
> Date is almost always a criteria in scans of this table. As mentioned
> earlier, the table is naturally built in date order - so would
> rebuilding the table help? Is it possible that even though I'm
> inserting in date order, since I delete rows so often the physical
> correlation would get disrupted as disk pages are reused? Perhaps
> clustering on the transaction_date index and periodically running
> "cluster" would help? Does vacuum full help with this at all?
Yes, cluster would rebuild the table for you. I wouldn't do anything too intrusive, run with the random_page_cost lowered, perhaps vacuum full, reindex, and see what happens. If it degrades over time, then I would start looking at partitioning or some other solution.
> Yeah - partitioning makes a lot of sense and I've thought about doing
> this in the past. Although I do run queries that cross multiple days,
> most of my queries only look at today's data, so the physical disk
> organization would likely be much better with a partitioned table setup.
> Also, since I usually delete old data one day at a time, I could simply
> drop the old day's partition. This would make vacuuming much less of an
> issue.
Yep, my thoughts exactly. Partitioning support in PostgreSQL is there, but it needs a bit more of a tighter integration into the core (I shouldn't have to create a view, n tables, n rules, etc). Additionally, I have read that at some point when you have "y" partitions the performance degrades, haven't really looked into it myself.
--
Chad
http://www.postgresqlforums.com/
On Wed, 2007-01-17 at 10:28, Jeremy Haile wrote:
> > That's about 32% dead rows. Might be worth scheduling a vacuum full,
> > but it's not like I was afraid it might be. It looks to me like you
> > could probably use a faster I/O subsystem in that machine though.
>
> I'll try to schedule a full vacuum tonight. As far as I/O - it's using
> SAN over fiber. Not as fast as internal SCSI though...

Also, look at the thread going by about index bloat by 4x. You'll
likely want to reindex after a vacuum full since vacuum full doesn't
reclaim space in indexes and in fact often bloats indexes.
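(Concretely, the post-VACUUM FULL step could be as simple as this - REINDEX
also takes exclusive locks, so it belongs in the same maintenance window:)

-- rebuild every index on the table from scratch
reindex table transaction_facts;
-- or rebuild just the hottest one
reindex index transaction_facts_transaction_date_idx;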
> It would be nice if the database could
> learn to estimate these values, as newer versions of Oracle does.

That would be really nice since it would take some of the guess work
out of it.

> Yes, cluster would rebuild the table for you. I wouldn't do anything too
> intrusive, run with the random_page_cost lowered, perhaps vacuum full,
> reindex, and see what happens.

I'll try doing the vacuum full and reindex tonight since they require
exclusive locks.

> Yep, my thoughts exactly. Partitioning support in PostgreSQL is there,
> but it needs a bit more of a tighter integration into the core (I shouldn't
> have to create a view, n tables, n rules, etc). Additionally, I have read
> that at some point when you have "y" partitions the performance degrades,
> haven't really looked into it myself.

Yeah - I haven't set up partitioning in PostgreSQL before, although
I've read quite a bit about it. I've talked about getting improved
syntax for partitioning in PostgreSQL. MySQL's syntax is much simpler
and more intuitive compared to setting them up with Postgres - it would
be nice if PostgreSQL adopted a similar syntax where partitions were
first-class citizens.
> Also, look at the thread going by about index bloat by 4x. You'll
> likely want to reindex after a vacuum full since vacuum full doesn't
> reclaim space in indexes and in fact often bloats indexes.

Thanks for the pointer. That thread might indeed apply to my
situation. I'm going to reindex the table tonight.

Jeremy Haile
>> Assuming the table's NOT bloated, you may do well to increase the
>> effective_cache_size, which doesn't allocate anything,
> <snip>
>> try setting it to something like 512MB or so.
>
> It's currently set to 1000MB.
>
>> If your table is bloating, and you don't have idle transactions hanging
>> off the database, it could be that your fsm settings are too low.
>
> fsm is currently set to 2000000. Is there any harm in setting it too
> high? =)

I generally recommend this - it's a nice list of the most important
settings in postgresql.conf (with respect to performance), along with a
short explanation and suggested values:

   http://www.powerpostgresql.com/PerfList

I'm using it as a general guide when setting up and tuning our servers.
Anyway, as someone already pointed out, it's an art to choose the
proper values - there's nothing like 'the only best values'.

tomas
> That's about 32% dead rows. Might be worth scheduling a vacuum full,
> but it's not like I was afraid it might be. It looks to me like you
> could probably use a faster I/O subsystem in that machine though.
>
> If the random page cost being lower fixes your issues, then I'd just run
> with it lower for now. note that while lowering it may fix one query,
> it may break another. Tuning pgsql, like any database, is as much art
> as science...

A nice feature of postgresql is the ability to log the 'slow queries'
(exceeding some time limit) - you can use it to compare the performance
of various settings. We're using it to 'detect' stupid SQL etc.

Just set it reasonably (the value depends on you), for example we used
about 500ms originally and after about two months of improvements we
lowered it to about 100ms.

You can analyze the log by hand, but about a year ago I've written a
tool to parse it and build a set of HTML reports (with an overview and
details about each query) along with graphs and examples of queries.

You can get it here: http://opensource.pearshealthcyber.cz/

Just beware, it's written in PHP and it definitely is not perfect:

(1) memory requirements (about 4x the size of the log)
(2) not too fast (about 20mins of P4@3GHz for a 200MB log)
(3) it requires a certain log format (see the page)

I did some improvements to the script recently, but forgot to upload
it. I'll do that tomorrow.

Tomas
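(The setting being described is log_min_duration_statement; a sketch of the
postgresql.conf line, using the thresholds mentioned above:)

# log any statement that runs longer than 500 ms
# (value is in milliseconds; 0 logs every statement, -1 disables the feature)
log_min_duration_statement = 500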
Interesting - I haven't seen that tool before. I'll have to check it
out when I get a chance. Thanks!

On Wed, 17 Jan 2007 20:32:37 +0100, "Tomas Vondra" <tv@fuzzy.cz> said:
> > That's about 32% dead rows. Might be worth scheduling a vacuum full,
> > but it's not like I was afraid it might be. It looks to me like you
> > could probably use a faster I/O subsystem in that machine though.
> >
> > If the random page cost being lower fixes your issues, then I'd just run
> > with it lower for now. note that while lowering it may fix one query,
> > it may break another. Tuning pgsql, like any database, is as much art
> > as science...
>
> A nice feature of postgresql is the ability to log the 'slow queries'
> (exceeding some time limit) - you can use it to compare the performance
> of various settings. We're using it to 'detect' stupid SQL etc.
>
> Just set it reasonably (the value depends on you), for example we used
> about 500ms originally and after about two months of improvements we
> lowered it to about 100ms.
>
> You can analyze the log by hand, but about a year ago I've written a
> tool to parse it and build a set of HTML reports (with an overview and
> details about each query) along with graphs and examples of queries.
>
> You can get it here: http://opensource.pearshealthcyber.cz/
>
> Just beware, it's written in PHP and it definitely is not perfect:
>
> (1) memory requirements (about 4x the size of the log)
> (2) not too fast (about 20mins of P4@3GHz for a 200MB log)
> (3) it requires a certain log format (see the page)
>
> I did some improvements to the script recently, but forgot to upload it.
> I'll do that tomorrow.
>
> Tomas
Hey there;

I've been lurking on this list awhile, and I've been working with
postgres for a number of years so I'm not exactly new to this. But I'm
still having trouble getting a good balance of settings and I'd like to
see what other people think. We may also be willing to hire a
contractor to help tackle this problem if anyone is interested.

I've got an application here that runs large (in terms of length -- the
queries have a lot of conditions in them) queries that can potentially
return millions of rows but on average probably return tens of
thousands of rows. It's read only for most of the day, and pretty much
all the queries except one are really fast.

However, each night we load data from a legacy cobol system into the
SQL system and then we summarize that data to make the reports faster.
This load process is intensely insert/update driven but also has a
hefty amount of selects as well. This load process is taking ever
longer to complete.

SO ... our goal here is to make this load process take less time. It
seems the big part is building the big summary table; this big summary
table is currently 9 million rows big. Every night, we drop the table,
re-create it, build the 9 million rows of data (we use COPY to put the
data in when it's prepared, not INSERT), and then build the indexes on
it -- of which there are many. Unfortunately this table gets queried
in a lot of different ways and needs these indexes; also unfortunately,
we have operator class indexes to support both ASC and DESC sorting on
columns so these are for all intents and purposes duplicate but
required under Postgres 8.1 (we've recently upgraded to Postgres 8.2,
is this still a requirement?)

Building these indexes takes forever! It's a long grind through
inserts and then building the indexes takes a hefty amount of time too.
(about 9 hours). Now, the application is likely part at fault, and
we're working to make it more efficient, but it has nothing to do with
the index building time. I'm wondering what we can do to make this
better if anything; would it be better to leave the indexes on? It
doesn't seem to be. Would it be better to use INSERTs instead of
copies? Doesn't seem to be.

Anyway -- ANYTHING we can do to make this go faster is appreciated :)
Here's some vital statistics:

- Machine is a 16 GB, 4 actual CPU dual-core opteron system using SCSI
discs. The disc configuration seems to be a good one, it's the best of
all the ones we've tested so far.

- The load process itself takes about 6 gigs of memory, the rest is
free for postgres because this is basically all the machine does.

- If this was your machine and situation, how would you lay out the
memory settings? What would you set the FSM to? Would you leave the
bgwriter on or off? We've already got FSYNC off because "data
integrity" doesn't matter -- this stuff is religiously backed up and
we've got no problem reinstalling it. Besides, in order for this
machine to go down, data integrity of the DB is the least of the
worries :)

Do wal_buffers/full_page_writes matter if FSYNC is off? If so, what
settings? What about checkpoints?

And finally, any ideas on planner constants? Here's what I'm using:

seq_page_cost = 0.5                     # measured on an arbitrary scale
random_page_cost = 1.0                  # same scale as above
cpu_tuple_cost = 0.001                  # same scale as above
cpu_index_tuple_cost = 0.0001           # same scale as above
cpu_operator_cost = 0.00025             # same scale as above
effective_cache_size = 679006

I really don't remember how I came up with that effective_cache_size
number.... Anyway...
any advice would be appreciated :) Steve
> And finally, any ideas on planner constants? Here's what I'm using:
>
> seq_page_cost = 0.5                     # measured on an arbitrary scale
> random_page_cost = 1.0                  # same scale as above
> cpu_tuple_cost = 0.001                  # same scale as above
> cpu_index_tuple_cost = 0.0001           # same scale as above
> cpu_operator_cost = 0.00025             # same scale as above
> effective_cache_size = 679006
>
> I really don't remember how I came up with that effective_cache_size
> number....

I don't have much experience with the way your application works, but:

1) What is the size of the whole database? Does that fit in your
memory? That's the first thing I'd like to know and I can't find it in
your post.

I'm missing several other important values too - namely

  shared_buffers
  max_fsm_pages
  work_mem
  maintenance_work_mem

BTW, is the autovacuum daemon running? If yes, try to stop it during
the import (and run ANALYZE after the import of data).

2) What is the size of a disc page? Without that we can only guess what
the effective_cache_size number means - in the usual case it's 8kB,
thus giving about 5.2 GiB of memory.

As suggested in http://www.powerpostgresql.com/PerfList I'd increase
that to about 1.400.000, which is about 10.5 GiB (about 2/3 of RAM).
Anyway - don't be afraid this breaks something. This is just an
information for PostgreSQL how much memory the OS is probably using as
a filesystem cache. PostgreSQL uses this to evaluate the probability
that the page is in a cache.

3) What is the value of maintenance_work_mem? This is a very important
value for CREATE INDEX (and some others). The lower this value is, the
slower the CREATE INDEX is. So try to increase the value as much as you
can - this could / should improve the import performance considerably.

But be careful - this does influence the amount of memory allocated by
PostgreSQL. Being in your position I wouldn't do this in the
postgresql.conf - I'd do that in the connection used by the import
using the SET command, i.e. something like

  SET maintenance_work_mem = 524288;
  CREATE INDEX ...
  CREATE INDEX ...
  CREATE INDEX ...
  CREATE INDEX ...

for 512 MiB of maintenance_work_mem. Maybe even a higher value could be
used (1 GiB?). Just try to fiddle with this a little.

4) Try to set up some performance monitoring - for example 'dstat' is a
nice way to do that. This way you can find out where the bottleneck is
(memory, I/O etc.)

That's basically all I can think of right now.

Tomas
On 1/17/07, Steve <cheetah@tanabi.org> wrote:
> However, each night we load data from a legacy COBOL system into the SQL system and then we summarize that data to make the reports faster. This load process is intensely insert/update driven but also has a hefty amount of selects as well. This load process is taking ever longer to complete.
How many rows do you typically load each night? If it is say less than 10% of the total rows, then perhaps the suggestion in the next paragraph is reasonable.
> SO ... our goal here is to make this load process take less time. It seems the big part is building the big summary table; this big summary table is currently 9 million rows big. Every night, we drop the table, re-create it, build the 9 million rows of data (we use COPY to put the data in when it's prepared, not INSERT), and then build the indexes on it
Perhaps placing a trigger on the source table and building a "change log" would be useful. For example, you could scan the change log (looking for inserts, updates, and deletes) and integrate those changes into your summary table. Obviously if you are using complex aggregates it may not be possible to adjust the summary table, but if you are performing simple SUMs, COUNTs, etc. then this is a workable solution.
--
Chad
http://www.postgresqlforums.com/
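A minimal sketch of the change-log idea above (the table source_table, its integer key column id, and every other name here are invented for illustration; this is not code from the thread):

CREATE TABLE source_change_log (
    change_id   bigserial PRIMARY KEY,
    op          char(1)   NOT NULL,              -- 'I', 'U' or 'D'
    source_id   integer   NOT NULL,              -- key of the changed source row
    changed_at  timestamp NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_source_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO source_change_log (op, source_id) VALUES ('D', OLD.id);
        RETURN OLD;
    END IF;
    INSERT INTO source_change_log (op, source_id)
    VALUES (CASE WHEN TG_OP = 'UPDATE' THEN 'U' ELSE 'I' END, NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER source_change_log_trg
    AFTER INSERT OR UPDATE OR DELETE ON source_table
    FOR EACH ROW EXECUTE PROCEDURE log_source_change();

The nightly job would then read source_change_log, fold the logged rows into the summary table, and truncate the log, rather than rebuilding all 9 million rows.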
Steve wrote:

> SO ... our goal here is to make this load process take less time. It seems the big part is building the big summary table; this big summary table is currently 9 million rows big. Every night, we drop the table, re-create it, build the 9 million rows of data (we use COPY to put the data in when it's prepared, not INSERT), and then build the indexes on it -- of which there are many.

Would it be possible to just update the summary table, instead of recreating it from scratch every night?

> Unfortunately this table gets queried in a lot of different ways and needs these indexes; also unfortunately, we have operator-class indexes to support both ASC and DESC sorting on columns, so these are for all intents and purposes duplicates but required under Postgres 8.1 (we've recently upgraded to Postgres 8.2 -- is this still a requirement?)

I don't think this has changed in 8.2.

> Building these indexes takes forever! It's a long grind through inserts and then building the indexes takes a hefty amount of time too (about 9 hours). Now, the application is likely partly at fault, and we're working to make it more efficient, but that has nothing to do with the index building time. I'm wondering what we can do to make this better if anything; would it be better to leave the indexes on? It doesn't seem to be. Would it be better to use INSERTs instead of COPY? Doesn't seem to be.

Would it help if you created multiple indexes simultaneously? You have enough CPU to do it. Is the index creation CPU or I/O bound? 9 million rows should fit in 16 GB of memory, right?

> - The load process itself takes about 6 gigs of memory; the rest is free for postgres because this is basically all the machine does.

Can you describe the load process in more detail? What's it doing with the 6 gigs?

> - If this was your machine and situation, how would you lay out the memory settings? What would you set the FSM to?

FSM seems irrelevant here..

> Do wal_buffers/full_page_writes matter if fsync is off?

Better turn off full_page_writes, since you can kiss goodbye to data integrity anyway with fsync=off.

> Anyway... any advice would be appreciated :)

What's your maintenance_work_mem setting? It can make a big difference in sorting the data for indexes.

If you could post the schema including the indexes, people might have more ideas...

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
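A sketch of the "multiple indexes simultaneously" idea (index and column names invented): a single CREATE INDEX in 8.2 uses only one CPU, so the parallelism has to come from several connections, each building a different index. Plain CREATE INDEX takes a SHARE lock, which does not conflict with another CREATE INDEX on the same table.

-- connection 1:
SET maintenance_work_mem = 1048576;   -- 1 GB, in kB
CREATE INDEX big_summary_col_a_idx ON big_summary (col_a);

-- connection 2, running at the same time:
SET maintenance_work_mem = 1048576;
CREATE INDEX big_summary_col_b_idx ON big_summary (col_b);

-- ... one connection per index, up to roughly the number of cores, as long as
-- the combined maintenance_work_mem of all sessions still fits in RAM.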
> 1) What is the size of the whole database? Does that fit in your memory? That's the first thing I'd like to know and I can't find it in your post.

Current on-disk size is about 51 gig. I'm not sure if there's a different size I should be looking at instead, but that's what du tells me the directory for the database in the "base" directory is sized at. So, no, it doesn't fit into memory all the way.

> I'm missing several other important values too - namely
>
> shared_buffers
> max_fsm_pages
> work_mem
> maintenance_work_mem

I didn't share these because they've been in flux :) I've been experimenting with different values, but currently we're using:

shared_buffers = 8GB
max_fsm_pages = 100000
work_mem = 256MB
maintenance_work_mem = 6GB

> BTW, is the autovacuum daemon running? If yes, try to stop it during the import (and run ANALYZE after the import of data).

No, all vacuums are done explicitly since the database doesn't change during the day. The 'order of operations' is:

- Load COBOL data into database (inserts/updates)
- VACUUM COBOL data
- Summarize COBOL data (inserts/updates with the big table using COPY)
- VACUUM summary tables

So everything gets vacuumed as soon as it's updated.

> 2) What is the size of a disc page? Without that we can only guess what the effective_cache_size number means - in the usual case it's 8 kB, thus giving about 5.2 GiB of memory.

I believe it's 8 kB. I definitely haven't changed it :)

> As suggested in http://www.powerpostgresql.com/PerfList I'd increase that to about 1,400,000, which is about 10.5 GiB (about 2/3 of RAM).
>
> Anyway - don't be afraid this breaks something. This is just information for PostgreSQL about how much memory the OS is probably using as a filesystem cache. PostgreSQL uses it to evaluate the probability that a page is in cache.

Okay, I'll try the value you recommend. :)

> 3) What is the value of maintenance_work_mem? This is a very important value for CREATE INDEX (and some other operations). The lower this value is, the slower CREATE INDEX is. So try to increase the value as much as you can - this could / should improve the import performance considerably.
>
> But be careful - this does influence the amount of memory allocated by PostgreSQL. Being in your position I wouldn't do this in postgresql.conf - I'd do it in the connection used by the import using the SET command, ie. something like
>
> SET maintenance_work_mem = 524288;
> CREATE INDEX ...
> CREATE INDEX ...
> CREATE INDEX ...
> CREATE INDEX ...
>
> for 512 MiB of maintenance_work_mem. Maybe an even higher value could be used (1 GiB?). Just try to fiddle with this a little.

It's currently at 6GB in postgresql.conf, though you have a good point in that maybe it should be raised only right before the indexes are made, to save room. Things are certainly kinda tight in the config as is.

> 4) Try to set up some performance monitoring - for example 'dstat' is a nice way to do that. This way you can find out where the bottleneck is (memory, I/O etc.)
>
> That's basically all I can think of right now.

Thanks for the tips :)

Steve
On Wed, 17 Jan 2007, Benjamin Minshall wrote:

>> Building these indexes takes forever!
>>
>> Anyway -- ANYTHING we can do to make this go faster is appreciated :) Here's some vital statistics:
>>
>> - Machine is a 16 GB, 4 actual CPU dual-core Opteron system using SCSI discs. The disc configuration seems to be a good one; it's the best of all the ones we've tested so far.
>
> What are your shared_buffers, work_mem, and maintenance_work_mem settings?
>
> maintenance_work_mem is used for CREATE INDEX, and with 16GB of memory in the machine, maintenance_work_mem should be set to at least 1GB in my opinion.

shared_buffers = 8GB
work_mem = 256MB
maintenance_work_mem = 6GB

So that should be covered, unless I'm using too much memory and swapping. It does look like it's swapping a little, but not too badly as far as I can tell. I'm thinking of dialing back everything a bit, but I'm not really sure what the heck to do :) It's all guessing for me right now.

Steve
> Building these indexes takes forever!
>
> Anyway -- ANYTHING we can do to make this go faster is appreciated :) Here's some vital statistics:
>
> - Machine is a 16 GB, 4 actual CPU dual-core Opteron system using SCSI discs. The disc configuration seems to be a good one; it's the best of all the ones we've tested so far.

What are your shared_buffers, work_mem, and maintenance_work_mem settings?

maintenance_work_mem is used for CREATE INDEX, and with 16GB of memory in the machine, maintenance_work_mem should be set to at least 1GB in my opinion.

--
Benjamin Minshall <minshall@intellicon.biz>
Senior Developer -- Intellicon, Inc.
http://www.intellicon.biz
On Wed, 2007-01-17 at 15:58, Steve wrote:

> On Wed, 17 Jan 2007, Benjamin Minshall wrote:
>
> >> Building these indexes takes forever!
> >>
> >> Anyway -- ANYTHING we can do to make this go faster is appreciated :) Here's some vital statistics:
> >>
> >> - Machine is a 16 GB, 4 actual CPU dual-core Opteron system using SCSI discs. The disc configuration seems to be a good one; it's the best of all the ones we've tested so far.
> >
> > What are your shared_buffers, work_mem, and maintenance_work_mem settings?
> >
> > maintenance_work_mem is used for CREATE INDEX, and with 16GB of memory in the machine, maintenance_work_mem should be set to at least 1GB in my opinion.
>
> shared_buffers = 8GB
> work_mem = 256MB
> maintenance_work_mem = 6GB
>
> So that should be covered, unless I'm using too much memory and swapping. It does look like it's swapping a little, but not too badly as far as I can tell. I'm thinking of dialing back everything a bit, but I'm not really sure what the heck to do :) It's all guessing for me right now.

Generally speaking, once you've gotten to the point of swapping, even a little, you've gone too far. A better approach is to pick some conservative number, like 10-25% of your RAM for shared_buffers, and a gig or so for maintenance_work_mem, and then increase them while exercising the system, and measure the difference increasing them makes.

If going from 1G shared_buffers to 2G gets you a 10% increase, then good. If going from 2G to 4G gets you a 1.2% increase, it's questionable. You should reach a point where throwing more shared_buffers stops helping before you start swapping. But you might not.

Same goes for maintenance_work_mem. Incremental changes, accompanied by reproducible benchmarks / behaviour measurements, are the way to determine the settings.

Note that you can also vary those during different times of the day. You can have maintenance_work_mem set to 1 gig during the day and crank it up to 8 gig or something while loading data. shared_buffers can't be changed without restarting the db though.
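A sketch of the "vary it at load time" point (the value here is illustrative): keep a conservative maintenance_work_mem in postgresql.conf and raise it only in the connection that runs the nightly build:

-- inside the nightly load connection only
SET maintenance_work_mem = 2097152;   -- 2 GB, in kB, for the CREATE INDEX statements
-- ... COPY the summary data, then build the indexes ...
RESET maintenance_work_mem;           -- back to the postgresql.conf default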
> How many rows do you typically load each night? If it is say less than 10% of the total rows, then perhaps the suggestion in the next paragraph is reasonable.

Hrm. It's very, very variable. I'd say it's more than 10% on average, and it can actually be pretty close to 50-100% on certain days. Our data is based upon customer submissions, and normally it's a daily-basis kind of deal, but sometimes they'll resubmit their entire year on certain deadlines to make sure it's all in. Now, we don't have to optimize for those deadlines, just the 'average daily load'. It's okay if on those deadlines it takes forever, because that's understandable.

However, I will look into this and see if I can figure out this average value. This may be a valid idea, and I'll look some more at it.

Thanks!

Steve

>> SO ... our goal here is to make this load process take less time. It seems the big part is building the big summary table; this big summary table is currently 9 million rows big. Every night, we drop the table, re-create it, build the 9 million rows of data (we use COPY to put the data in when it's prepared, not INSERT), and then build the indexes on it
>
> Perhaps placing a trigger on the source table and building a "change log" would be useful. For example, you could scan the change log (looking for inserts, updates, and deletes) and integrate those changes into your summary table. Obviously if you are using complex aggregates it may not be possible to adjust the summary table, but if you are performing simple SUMs, COUNTs, etc. then this is a workable solution.
>
> --
> Chad
> http://www.postgresqlforums.com/
On 17-Jan-07, at 3:41 PM, Steve wrote:

> Hey there;
>
> I've been lurking on this list awhile, and I've been working with postgres for a number of years so I'm not exactly new to this. But I'm still having trouble getting a good balance of settings and I'd like to see what other people think. We may also be willing to hire a contractor to help tackle this problem if anyone is interested.
>
> I've got an application here that runs large (in terms of length -- the queries have a lot of conditions in them) queries that can potentially return millions of rows but on average probably return tens of thousands of rows. It's read only for most of the day, and pretty much all the queries except one are really fast.
>
> However, each night we load data from a legacy COBOL system into the SQL system and then we summarize that data to make the reports faster. This load process is intensely insert/update driven but also has a hefty amount of selects as well. This load process is taking ever longer to complete.
>
> SO ... our goal here is to make this load process take less time. It seems the big part is building the big summary table; this big summary table is currently 9 million rows big. Every night, we drop the table, re-create it, build the 9 million rows of data (we use COPY to put the data in when it's prepared, not INSERT), and then build the indexes on it -- of which there are many. Unfortunately this table gets queried in a lot of different ways and needs these indexes; also unfortunately, we have operator-class indexes to support both ASC and DESC sorting on columns, so these are for all intents and purposes duplicates but required under Postgres 8.1 (we've recently upgraded to Postgres 8.2 -- is this still a requirement?)
>
> Building these indexes takes forever! It's a long grind through inserts and then building the indexes takes a hefty amount of time too (about 9 hours). Now, the application is likely partly at fault, and we're working to make it more efficient, but that has nothing to do with the index building time. I'm wondering what we can do to make this better if anything; would it be better to leave the indexes on? It doesn't seem to be. Would it be better to use INSERTs instead of COPY? Doesn't seem to be.
>
> Anyway -- ANYTHING we can do to make this go faster is appreciated :) Here's some vital statistics:
>
> - Machine is a 16 GB, 4 actual CPU dual-core Opteron system using SCSI discs. The disc configuration seems to be a good one; it's the best of all the ones we've tested so far.

The basic problem here is simply writing all the data to disk. You are building 9M rows of data plus numerous indexes. How much data are you actually writing to the disk? Try looking at iostat while this is going on. My guess is you are maxing out the disk write speed.

> - The load process itself takes about 6 gigs of memory; the rest is free for postgres because this is basically all the machine does.
>
> - If this was your machine and situation, how would you lay out the memory settings? What would you set the FSM to? Would you leave the bgwriter on or off? We've already got fsync off because "data integrity" doesn't matter -- this stuff is religiously backed up and we've got no problem reinstalling it. Besides, in order for this machine to go down, data integrity of the DB is the least of the worries :)
>
> Do wal_buffers/full_page_writes matter if fsync is off? If so, what settings? What about checkpoints?

Not really; I'd have WAL buffers write to a RAM disk.

> And finally, any ideas on planner constants? Here's what I'm using:
>
> seq_page_cost = 0.5 # measured on an arbitrary scale
> random_page_cost = 1.0 # same scale as above
> cpu_tuple_cost = 0.001 # same scale as above
> cpu_index_tuple_cost = 0.0001 # same scale as above
> cpu_operator_cost = 0.00025 # same scale as above
> effective_cache_size = 679006

As a general rule, make shared_buffers about 25% of free memory and effective_cache_size about 75%, but with a write-intensive load like you have I think the first thing to look at is write speed.

> I really don't remember how I came up with that effective_cache_size number....
>
> Anyway... any advice would be appreciated :)
>
> Steve
> Would it be possible to just update the summary table, instead of recreating it from scratch every night?

Hrm, I believe it's probably less work for the computer to do if it's rebuilt. Any number of rows may be changed during an update, not including additions, so I'd have to pull out what's changed and sync it with what's in the summary table already. It'll be a lot more selects and program-side computation to save the big copy; it might work out, but I'd say this would be my last-ditch thing. :)

>> Building these indexes takes forever! It's a long grind through inserts and then building the indexes takes a hefty amount of time too (about 9 hours). Now, the application is likely partly at fault, and we're working to make it more efficient, but that has nothing to do with the index building time. I'm wondering what we can do to make this better if anything; would it be better to leave the indexes on? It doesn't seem to be. Would it be better to use INSERTs instead of COPY? Doesn't seem to be.
>
> Would it help if you created multiple indexes simultaneously? You have enough CPU to do it. Is the index creation CPU or I/O bound? 9 million rows should fit in 16 GB of memory, right?

This is a very, very interesting idea. It looks like we're probably not fully utilizing the machine for the index build, and this could be the ticket for us. I'm going to go ahead and set up a test for this and we'll see how it goes.

> Can you describe the load process in more detail? What's it doing with the 6 gigs?

There's two halves to the load process: the loader and the summarizer. The loader is the part that takes 6 gigs; the summarizer only takes a few hundred megs.

Basically we have these COBOL files that vary in size but are usually in the hundreds-of-megs realm. These files contain new data OR updates to existing data. We load this data from the COBOL files in chunks, so that's not a place where we're burning a lot of memory.

The first thing we do is cache the list of COBOL ID codes that are already in the DB; the COBOL ID codes are really long numeric strings, so we use a sequenced integer primary key. The cache translates COBOL IDs to primary keys, and this takes most of our memory nowadays. Our cache is fast, but it's kind of a memory hog. We're working on trimming that down, but it's definitely faster than making a query for each COBOL ID.

The load is relatively fast and is considered "acceptable", and has been relatively constant in speed. It's the summarizer that's brutal.

The summarizer produces 3 main summary tables and a few secondaries that don't take much time to make. Two of them are smallish and not that big a deal, and the last one is the biggie that's 9 mil rows and growing. To produce the 9 mil row table, we query out the data in groups, do our processing, and save that data to a series of text files that are in blocks of 10,000 rows as I recall. We then copy each file into the DB (there were some issues with copying in an entire 9 mil row file in the past, which is why we don't use just one file -- those issues have been fixed, but we didn't undo the change).

> What's your maintenance_work_mem setting? It can make a big difference in sorting the data for indexes.

6 gigs currently. :)

> If you could post the schema including the indexes, people might have more ideas...

I'll have to ask first, but I'll see if I can :)

Talk to you later, and thanks for the info!

Steve
On 1/17/07, Steve <cheetah@tanabi.org> wrote:
> However, I will look into this and see if I can figure out this average value. This may be a valid idea, and I'll look some more at it.
It must be, Oracle sells it pretty heavily as a data warehousing feature ;). Oracle calls it a materialized view, and the basic premise is you have a "change" log (called a materialized log by Oracle) and you have a job that runs through the change log and applies the changes to the materialized view.
If you are using aggregates, be careful and make sure you use simple forms of those aggregates. For example, if you are using an "average" function then you should have two columns, sum and count, instead. Some aggregates are too complex and cannot be represented by this solution, and you will find that you can't update the summary tables, so definitely try to stay away from complex aggregates if you do not need them.
Here is a link to a PL/pgSQL effort that tries to simulate materialized views:
http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html
I don't know how complete it is, and it looks like there was a project started but has been abandoned for the last 3 years.
--
Chad
http://www.postgresqlforums.com/
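A small sketch of the "simple forms of aggregates" point (table and column names invented for illustration): store the additive pieces and derive the average at query time, so a batch of changes can be folded in without recomputing the whole table.

CREATE TABLE daily_summary (
    merchant_id  integer NOT NULL,
    txn_day      date    NOT NULL,
    amount_sum   numeric NOT NULL DEFAULT 0,
    txn_count    bigint  NOT NULL DEFAULT 0,
    PRIMARY KEY (merchant_id, txn_day)
);

-- applying a batch of changes only touches the additive components
UPDATE daily_summary
   SET amount_sum = daily_summary.amount_sum + d.delta_sum,
       txn_count  = daily_summary.txn_count  + d.delta_count
  FROM change_log_deltas d              -- hypothetical per-day totals built from the change log
 WHERE daily_summary.merchant_id = d.merchant_id
   AND daily_summary.txn_day     = d.txn_day;

-- the "average" is never stored; it is derived when queried
SELECT merchant_id, txn_day,
       amount_sum / NULLIF(txn_count, 0) AS avg_amount
  FROM daily_summary;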
cheetah@tanabi.org (Steve) writes:

> I'm wondering what we can do to make this better if anything; would it be better to leave the indexes on? It doesn't seem to be.

Definitely NOT. Generating an index via a bulk sort is a LOT faster than loading data into an index one tuple at a time.

We saw a BIG increase in performance of Slony-I when, in version 1.1.5, we added a modification that shuts off indexes during COPY and then does a reindex. Conceivably, you might look at how Slony-I does that, and try doing the same thing; it might well be faster than doing a bunch of reindexes serially. (Or not...)

> Would it be better to use INSERTs instead of copies? Doesn't seem to be.

I'd be mighty surprised.

> - The load process itself takes about 6 gigs of memory; the rest is free for postgres because this is basically all the machine does.

The settings you have do not seem conspicuously wrong in any way.

The one thought which comes to mind is that if you could turn this into a *mostly* incremental change, that might help. The thought:

- Load the big chunk of data into a new table

- Generate some minimal set of indices on the new table

- Generate four queries that compare old to new:
  q1 - See which tuples are unchanged from yesterday to today
  q2 - See which tuples have been deleted from yesterday to today
  q3 - See which tuples have been added
  q4 - See which tuples have been modified

If the "unchanged" set is extremely large, then you might see benefit to doing updates based on deleting the rows indicated by q2, inserting rows based on q3, and updating based on q4.

In principle, computing and applying those 4 queries might be quicker than rebuilding from scratch.

In principle, applying q2, then q4, then vacuuming, then q3, ought to be "optimal."

--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/linux.html
"A 'Cape Cod Salsa' just isn't right." -- Unknown
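A sketch of that q2/q4/q3 sequence (it assumes the old and new tables share a key column id and two payload columns; all names are invented for illustration):

-- q2: delete rows that no longer exist in today's data
DELETE FROM summary
 WHERE NOT EXISTS (SELECT 1 FROM summary_new n WHERE n.id = summary.id);

-- q4: update rows whose contents changed
UPDATE summary
   SET col_a = n.col_a,
       col_b = n.col_b
  FROM summary_new n
 WHERE n.id = summary.id
   AND (n.col_a IS DISTINCT FROM summary.col_a
     OR n.col_b IS DISTINCT FROM summary.col_b);

VACUUM summary;

-- q3: insert rows that are new today
INSERT INTO summary
SELECT n.*
  FROM summary_new n
 WHERE NOT EXISTS (SELECT 1 FROM summary s WHERE s.id = n.id);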
> Generally speaking, once you've gotten to the point of swapping, even a little, you've gone too far. A better approach is to pick some conservative number, like 10-25% of your RAM for shared_buffers, and a gig or so for maintenance_work_mem, and then increase them while exercising the system, and measure the difference increasing them makes.
>
> If going from 1G shared_buffers to 2G gets you a 10% increase, then good. If going from 2G to 4G gets you a 1.2% increase, it's questionable. You should reach a point where throwing more shared_buffers stops helping before you start swapping. But you might not.
>
> Same goes for maintenance_work_mem. Incremental changes, accompanied by reproducible benchmarks / behaviour measurements, are the way to determine the settings.
>
> Note that you can also vary those during different times of the day. You can have maintenance_work_mem set to 1 gig during the day and crank it up to 8 gig or something while loading data. shared_buffers can't be changed without restarting the db though.

I'm currently benchmarking various configuration adjustments. Problem is, these tests take a really long time because I have to run the load process... which is like a 9 hour deal. That's why I'm asking for advice here, because there's a lot of variables here and it's really time costly to test :)

I'm still working on the benchmarking and by Friday I should have some interesting statistics to work with and maybe help figure out what's going on.

Thanks!

Steve
> The thought:
>
> - Load the big chunk of data into a new table
>
> - Generate some minimal set of indices on the new table
>
> - Generate four queries that compare old to new:
>   q1 - See which tuples are unchanged from yesterday to today
>   q2 - See which tuples have been deleted from yesterday to today
>   q3 - See which tuples have been added
>   q4 - See which tuples have been modified
>
> If the "unchanged" set is extremely large, then you might see benefit to doing updates based on deleting the rows indicated by q2, inserting rows based on q3, and updating based on q4.
>
> In principle, computing and applying those 4 queries might be quicker than rebuilding from scratch.
>
> In principle, applying q2, then q4, then vacuuming, then q3, ought to be "optimal."

This looks like an interesting idea, and I'm going to take a look at how feasible it'll be to implement. I may be able to combine this with Mr. Wagner's idea to make a much more efficient system overall. It's going to be a pretty big programming task, but I've a feeling this summarizer thing may just need to be re-written with a smarter system like this to get something faster.

Thanks!

Steve
On Wed, 2007-01-17 at 18:27, Steve wrote:

> > Generally speaking, once you've gotten to the point of swapping, even a little, you've gone too far. A better approach is to pick some conservative number, like 10-25% of your RAM for shared_buffers, and a gig or so for maintenance_work_mem, and then increase them while exercising the system, and measure the difference increasing them makes.
> >
> > If going from 1G shared_buffers to 2G gets you a 10% increase, then good. If going from 2G to 4G gets you a 1.2% increase, it's questionable. You should reach a point where throwing more shared_buffers stops helping before you start swapping. But you might not.
> >
> > Same goes for maintenance_work_mem. Incremental changes, accompanied by reproducible benchmarks / behaviour measurements, are the way to determine the settings.
> >
> > Note that you can also vary those during different times of the day. You can have maintenance_work_mem set to 1 gig during the day and crank it up to 8 gig or something while loading data. shared_buffers can't be changed without restarting the db though.
>
> I'm currently benchmarking various configuration adjustments. Problem is, these tests take a really long time because I have to run the load process... which is like a 9 hour deal. That's why I'm asking for advice here, because there's a lot of variables here and it's really time costly to test :)
>
> I'm still working on the benchmarking and by Friday I should have some interesting statistics to work with and maybe help figure out what's going on.

You can probably take a portion of what you're loading and make a benchmark of the load process that is repeatable (same data, size, etc...) each time, but only takes 30 minutes to an hour to run each time. That shortens your test iteration AND makes it reliably repeatable.
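One way to carve out such a repeatable benchmark slice (table and column names invented for illustration): freeze one representative day of source data once, and point the benchmark runs at that snapshot so every iteration processes exactly the same rows.

-- done once
CREATE TABLE cobol_staging_sample AS
SELECT *
  FROM cobol_staging                   -- hypothetical staging table the loader fills
 WHERE load_date = DATE '2007-01-10';  -- a fixed, representative day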
On 1/17/07, Steve <cheetah@tanabi.org> wrote:

> Hey there; I've been lurking on this list awhile, and I've been working with postgres for a number of years so I'm not exactly new to this. But I'm still having trouble getting a good balance of settings and I'd like to see what other people think. We may also be willing to hire a contractor to help tackle this problem if anyone is interested.

I happen to be something of a cobol->sql expert; if you are interested in some advice you can contact me off-list. I converted an enterprise COBOL (in AcuCOBOL) app to PostgreSQL by plugging pg into the COBOL system via a custom C++ ISAM driver.

> I've got an application here that runs large (in terms of length -- the queries have a lot of conditions in them) queries that can potentially return millions of rows but on average probably return tens of thousands of rows. It's read only for most of the day, and pretty much all the queries except one are really fast.

If it's just one query, I think I'd focus on optimizing that query, not .conf settings. In my opinion .conf tuning (a few gotchas aside) doesn't really get you all that much.

> However, each night we load data from a legacy COBOL system into the SQL system and then we summarize that data to make the reports faster. This load process is intensely insert/update driven but also has a hefty amount of selects as well. This load process is taking ever longer to complete.
>
> SO ... our goal here is to make this load process take less time. It seems the big part is building the big summary table; this big summary table is currently 9 million rows big. Every night, we drop the table, re-create it, build the 9 million rows of data (we use COPY to put the data in when it's prepared, not INSERT), and then build the indexes on it -- of which there are many. Unfortunately this table gets queried in a lot of different ways and needs these indexes; also unfortunately, we have operator-class indexes to support both ASC and DESC sorting on columns

I have some very specific advice here. Check out the row-wise comparison feature introduced in 8.2.

> so these are for all intents and purposes duplicates but required under Postgres 8.1 (we've recently upgraded to Postgres 8.2 -- is this still a requirement?)
>
> Building these indexes takes forever! It's a long grind through inserts and then building the indexes takes a hefty amount of time too (about 9 hours). Now, the application is likely partly at fault, and we're working to make it more efficient, but that has nothing to do with the index building time. I'm wondering what we can do to make this better if anything; would it be better to leave the indexes on? It doesn't seem to be. Would it be better to use INSERTs instead of copies? Doesn't seem to

No. Probably any optimization strategies would focus on reducing the amount of data you had to load.

merlin
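For reference, a sketch of the 8.2 row-wise comparison Merlin points to (table and column names invented; whether it removes the need for any particular index depends on the actual queries): an ascending index on (col_a, col_b) can satisfy a "resume after the last row seen" scan directly.

-- uses the (col_a, col_b) index to start just past the last row already returned
SELECT *
  FROM big_summary
 WHERE (col_a, col_b) > (42, DATE '2007-01-17')
 ORDER BY col_a, col_b
 LIMIT 100;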