Thread: Optimizer Question/Suggestion

Optimizer Question/Suggestion

From: Philip Warner
The optimizer seems to know about dead rows in tables (ie. it will use an 
index it would not ordinarily use if vacuum-full had been run, apparently 
because it knows the table has many dead rows and only a few valid ones).

I was wondering if there would be any value in letting the optimizer replace 
a sequential scan with a dummy index scan (eg. on the PK) in cases where it 
knew the table was full of dead rows.

This comes about because we have a table with 800 rows, one or more of which 
are updated every second of most days. The result in normal usage is that the 
table contains about 10000 tuples one hour after vacuuming. Also, the database 
tries to be 24x7, and the table concerned is a core table, so a vacuum full 
once per hour is not an option.

To give some numbers:

mail=# explain select * from often_updated;
Seq Scan on often_updated (cost=0.00..49273.50 rows=750 width=205)

mail=# explain select * from often_updated where id between '-10000' and 
'10000';
Index Scan using often_updated_id on often_updated (cost=0.00..3041.80 
rows=741 width=205)

(the IDs range from 0 to about 1200).

Creating a table by selecting rows from the first table, defining an index, 
and then analyzing it results in:

mail=# explain select * from bu where id between '-10000' and '10000';
Seq Scan on bu  (cost=0.00..33.25 rows=741 width=205)

...which is perfectly reasonable.
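
(For reference, the copy was presumably built along the following lines; the 
exact statements weren't quoted, and the index name bu_id is illustrative:)

CREATE TABLE bu AS SELECT * FROM often_updated;
CREATE INDEX bu_id ON bu (id);
ANALYZE bu;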

ISTM that if a table has a PK, then a bogus index scan should be introduced 
when the table has more than a 'RandomPageCost/SequentialPageCost' ratio of 
dead:live tuples. Or we should always add a PK scan to the list of 
strategies considered.
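
(The closest thing available to the numbers such a heuristic would need is 
the live-tuple count versus the page count kept in pg_class; a rough by-hand 
bloat check for the table above might look like the following, where the 
derived pages_per_live_row column is purely illustrative:)

SELECT relname, relpages, reltuples,
       relpages::float8 / nullif(reltuples, 0) AS pages_per_live_row
FROM pg_class
WHERE relname = 'often_updated';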






Re: Optimizer Question/Suggestion - numbers after

From: Philip Warner
At 07:39 PM 2/11/2002 +1100, Philip Warner wrote:
>To give some numbers:

And some more numbers, directly after a vacuum and analyze:

mail=# explain analyze select * from often_updated where id between 
'-10000' and '100000';
Index Scan using barnet_users_id on often_updated  (cost=0.00..3095.66 
rows=750 width=205) (actual time=0.15..41.04 rows=750 loops=1)
Total runtime: 44.81 msec

mail=# explain analyze select * from often_updated;
Seq Scan on often_updated  (cost=0.00..49273.50 rows=750 width=205) (actual 
time=1.93..1710.01 rows=750 loops=1)
Total runtime: 1714.32 msec

The latter time is actually quite good; when the machine is more heavily 
loaded it goes up to 10000ms.

We currently vacuum/analyze daily, and analyze hourly.





Re: Optimizer Question/Suggestion - numbers after

From: Doug McNaught
Philip Warner <pjw@rhyme.com.au> writes:

> At 07:39 PM 2/11/2002 +1100, Philip Warner wrote:

> The latter time is actually quite good; when the machine is more
> heavily loaded it goes up to 10000ms.
> 
> We currently vacuum/analyze daily, and analyze hourly.

Why not vacuum hourly (regular non-blocking vacuum, not FULL)?

-Doug


Re: Optimizer Question/Suggestion

From: Tom Lane
Philip Warner <pjw@rhyme.com.au> writes:
> This comes about because we have a table with 800 rows, one or more of which 
> are updated every second of most days. The result in normal usage is that the 
> table contains about 10000 tuples one hour after vacuuming. Also, the database 
> tries to be 24x7, and the table concerned is a core table, so a vacuum full 
> once per hour is not an option.

Why not do frequent non-full vacuums on only that table, perhaps every
five minutes or so?  That's certainly the direction that development is
headed in (we just haven't automated the vacuuming yet).
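
Concretely, that means a plain per-table vacuum issued from cron or any 
other scheduler, something like:

VACUUM often_updated;

Unlike VACUUM FULL, this takes no exclusive lock, so reads and writes can 
continue while it runs; add ANALYZE if the statistics should be refreshed 
at the same time.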

> ISTM that if a table has a PK, then a bogus index scan should be introduced 
> if a table has more than a 'RandomPageCost/SequentialPageCost' ratio of 
> dead:live tuples.

The ratio would have to be higher than that, because ordinarily you
expect to get more than one tuple per sequential page read.  But I think
this is going in the wrong direction anyway.  Ideally we should never
let a table get so overloaded with dead space that this strategy would
be profitable.

BTW, the system does not actually have any stats about dead tuples.
What it knows about are live tuples and total disk pages occupied by
the table.
        regards, tom lane


Re: Optimizer Question/Suggestion

From: Philip Warner
At 09:36 AM 2/11/2002 -0500, Tom Lane wrote:

>Why not do frequent non-full vacuums on only that table, perhaps every
>five minutes or so?  That's certainly the direction that development is
>headed in (we just haven't automated the vacuuming yet).

Done this now, and I'll wait for the next high-load period to see how big 
the table gets.

Definitely looking forward to integrated on-line vacuum!


>Ideally we should never let a table get so overloaded with dead space
>that this strategy would be profitable.

I suspect it would be more common than you might hope, both because of 
incompetence/changed database usage (as in this case) and archival 
strategies (ie. deleting data periodically, but *not* doing a full vacuum). 
I come from a background where pre-allocating unused space for table data 
is a good strategy, not a performance killer, and I'm probably not alone.

If it were not hard, I thought adding a PK scan as a possible strategy when 
considering a seqscan would be an interesting option. I suppose the other 
option in this case would be to modify the seqscan to look only at pages we 
know have records (if we keep that data?).


>BTW, the system does not actually have any stats about dead tuples.
>What it knows about are live tuples and total disk pages occupied by
>the table.

So what made it choose the index scan? Does it make guesses about tuple 
sizes, and predict empty space?








Re: Optimizer Question/Suggestion

From: Tom Lane
Philip Warner <pjw@rhyme.com.au> writes:
> At 09:36 AM 2/11/2002 -0500, Tom Lane wrote:
>> BTW, the system does not actually have any stats about dead tuples.
>> What it knows about are live tuples and total disk pages occupied by
>> the table.

> So what made it choose the index scan?

Well, the main component of the seqscan cost estimate is the total
number of disk pages, while the indexscan cost estimate is driven
by the number of tuples expected to be retrieved.
        regards, tom lane
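
(Back-of-envelope, taking costs in units of sequential page fetches and the 
default cpu_tuple_cost of 0.01, the numbers quoted earlier imply:

seqscan cost  ~=  relpages + reltuples * cpu_tuple_cost
49273.50      ~=  relpages + 750 * 0.01    =>    relpages ~= 49266

i.e. roughly 49,000 8K pages, close to 400MB, holding only ~750 live rows, 
versus a seqscan cost of 33.25 for the freshly rebuilt copy.)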


Re: Optimizer Question/Suggestion

From: Hannu Krosing
Philip Warner wrote on Sun, 03.11.2002 at 06:30:
> At 09:36 AM 2/11/2002 -0500, Tom Lane wrote:
> 
> >Why not do frequent non-full vacuums on only that table, perhaps every
> >five minutes or so?  That's certainly the direction that development is
> >headed in (we just haven't automated the vacuuming yet).
> 
> Done this now, and I'll wait for the next high-load period to see how big 
> the table gets.

You should find the best interval by testing.

I guess the interval could be smaller than 5 min for high loads - I 
remember testing this situation for top update performance, when several 
threads were doing updates at full speed, and the best performance was 
achieved by running a separate backend in a loop that kept doing VACUUM on 
the table with only a 5-second sleep in between.

-----------------
Hannu




Re: Optimizer Question/Suggestion

From: Philip Warner
At 03:25 PM 3/11/2002 +0500, Hannu Krosing wrote:
>a separate backend in a loop that
>kept doing VACUUM on the table with only a 5-second sleep in between

Good grief! I thought 5 minutes was bad enough. Can't wait for b/g vacuum. 
Thanks for the input; I'll wait for a day or so to get some figures as you 
suggest.






Re: Optimizer Question/Suggestion

From: Hannu Krosing
Philip Warner wrote on Sun, 03.11.2002 at 15:41:
> At 03:25 PM 3/11/2002 +0500, Hannu Krosing wrote:
> >a separate backend in a loop that
> >kept doing VACUUM on the table with only a 5-second sleep in between
> 
> Good grief! I thought 5 minutes was bad enough. Can't wait for b/g vacuum. 
> Thanks for the input; I'll wait for a day or so to get some figures as you 
> suggest.

The 5 sec number was for a case where tens of worker threads were updating, 
as fast as they could, a table with just a few hundred rows. I guess your 
case is not _that_ intensive, so you can probably use much bigger intervals.

-------
Hannu