Thread: Optimizer Question/Suggestion
The optimizer seems to know about dead rows in tables (ie. it will use an index it would not ordinarily use if vacuum-full had been run, apparently because it knows the table has many dead rows, and only a few valid ones. I was wondering if there would any value in letting the optimizer replace a sequential scan with a dummy index scan (eg. on PK) in cases where it knew that the table was full of dead rows. This comes about because we have a table with 800 rows, one more more of which are updated every second of most days. The result in normal usage is that the table contains about 10000 tuples one hour after vacuuming. Also, the databases tries to be 24x7, and the table concerned is a core table, so vacuum/full once per hour is not an option. To give some numbers: mail=# explain select * from often_updated; Seq Scan on often_updated (cost=0.00..49273.50 rows=750 width=205) mail=# explain select * from often_updated where id between '-10000' and '10000'; Index Scan using often_updated_id on often_updated (cost=0.00..3041.80 rows=741 width=205) (the IDs range from 0 to about 1200). Creating a table by selecting rows from the first table, defining an index then analyzing results in: mail=# explain select * from bu where id between '-10000' and '10000'; Seq Scan on bu (cost=0.00..33.25 rows=741 width=205) ...which is perfectly reasonable. ISTM that if a table has a PK, then a bogus index scan should be introduced if a table has more than a 'RandomPageCost/SequentialPageCost' ratio of dead:live tuples. Or we should always add a PK scan into the list of strategies considered. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
At 07:39 PM 2/11/2002 +1100, Philip Warner wrote: >To give some numbers: And some more numbers, directly after a vacuum and analyze: mail=# explain analyze select * from often_updated where id between '-10000' and '100000'; Index Scan using barnet_users_id on often_updated (cost=0.00..3095.66 rows=750 width=205) (actual time=0.15..41.04 rows=750 loops=1) Total runtime: 44.81 msec mail=# explain analyze select * from often_updated; Seq Scan on often_updated (cost=0.00..49273.50 rows=750 width=205) (actual time=1.93..1710.01 rows=750 loops=1) Total runtime: 1714.32 msec The latter time is actually quote good; when the machine is more heavily loaded it goes up to 10000ms. We currently vacuum/analyze daily, and analyze hourly. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > At 07:39 PM 2/11/2002 +1100, Philip Warner wrote: > The latter time is actually quote good; when the machine is more > heavily loaded it goes up to 10000ms. > > We currently vacuum/analyze daily, and analyze hourly. Why not vacuum hourly (regular non-blocking vacuum, not FULL)? -Doug
Philip Warner <pjw@rhyme.com.au> writes: > This comes about because we have a table with 800 rows, one more more of > which are updated every second of most days. The result in normal usage is > that the table contains about 10000 tuples one hour after vacuuming. Also, > the databases tries to be 24x7, and the table concerned is a core table, so > vacuum/full once per hour is not an option. Why not do frequent non-full vacuums on only that table, perhaps every five minutes or so? That's certainly the direction that development is headed in (we just haven't automated the vacuuming yet). > ISTM that if a table has a PK, then a bogus index scan should be introduced > if a table has more than a 'RandomPageCost/SequentialPageCost' ratio of > dead:live tuples. The ratio would have to be higher than that, because ordinarily you expect to get more than one tuple per sequential page read. But I think this is going in the wrong direction anyway. Ideally we should never let a table get so overloaded with dead space that this strategy would be profitable. BTW, the system does not actually have any stats about dead tuples. What it knows about are live tuples and total disk pages occupied by the table. regards, tom lane
At 09:36 AM 2/11/2002 -0500, Tom Lane wrote: >Why not do frequent non-full vacuums on only that table, perhaps every >five minutes or so? That's certainly the direction that development is >headed in (we just haven't automated the vacuuming yet). Done this now, and I'll wait for a new high load time to see how big the table gets. Definitely looking forward integrated on-line vacuum! >Ideally we should never let a table get so overloaded with dead space >that this strategy would be profitable. I suspect it would be more common that you might hope, both because of incompetance/changed database usage (as in this case) and archival strategies (ie. deleting data periodically, but *not* doing a full vacuum). I come from a background where pre-allocating unused space for table data is a good strategy, not a performance killer, and I'm probably not alone. If it was not hard, I thought adding a PK scan as a possible strategy when considering seqscan was an interesting option. I suppose the other option in this case would be to modify seqscan to only look at pages we know have records (if we keep that data?). >BTW, the system does not actually have any stats about dead tuples. >What it knows about are live tuples and total disk pages occupied by >the table. So what made it choose the index scan? Does it make guesses about tuple sizes, and predict empty space? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > At 09:36 AM 2/11/2002 -0500, Tom Lane wrote: >> BTW, the system does not actually have any stats about dead tuples. >> What it knows about are live tuples and total disk pages occupied by >> the table. > So what made it choose the index scan? Well, the main component of the seqscan cost estimate is the total number of disk pages, while the indexscan cost estimate is driven by the number of tuples expected to be retrieved. regards, tom lane
Philip Warner kirjutas P, 03.11.2002 kell 06:30: > At 09:36 AM 2/11/2002 -0500, Tom Lane wrote: > > >Why not do frequent non-full vacuums on only that table, perhaps every > >five minutes or so? That's certainly the direction that development is > >headed in (we just haven't automated the vacuuming yet). > > Done this now, and I'll wait for a new high load time to see how big the > table gets. You should find the best interval by testing. I guess the interval could be smaller than 5 min for high loads - I remember testing this situation for getting top update performance when several threads were doing updates at full speed and the best performance was achieved by running a separate backend in a loop that kept doing VACUUM TABLE with only 5 seconds sleep between . ----------------- Hannu
At 03:25 PM 3/11/2002 +0500, Hannu Krosing wrote: >a separate backend in a loop that >kept doing VACUUM TABLE with only 5 seconds sleep between Good grief! I thought 5 minutes was bad enough. Can't wait for b/g vacuum. Thanks for the input; I'll wait for a day or so to get some figures as you suggest. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner kirjutas P, 03.11.2002 kell 15:41: > At 03:25 PM 3/11/2002 +0500, Hannu Krosing wrote: > >a separate backend in a loop that > >kept doing VACUUM TABLE with only 5 seconds sleep between > > Good grief! I thought 5 minutes was bad enough. Can't wait for b/g vacuum. > Thanks for the input; I'll wait for a day or so to get some figures as you > suggest. The 5 sec number was for case when tens of worker threads were updating as fast as they could a table with just a few of hundreds of rows. I guess your case is not _that_ intensive, so you can probably use much bigger intervals. ------- Hannu