Thread: Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
"Heikki Linnakangas"
Date:
Alvaro Herrera wrote:
> Log Message:
> -----------
> Reduce the size of memory allocations by lazy vacuum when processing a small
> table, by allocating just enough for a hardcoded number of dead tuples per
> page. The current estimate is 200 dead tuples per page.

200 sounds like a badly chosen value. With an 8K block size, that's a bit
less than MaxHeapTuplesPerPage, which means that in the worst case you
don't allocate enough space to hold all dead tuples, and you end up doing
two index cleanups, no matter how large you set maintenance_work_mem.
Remember that having MaxHeapTuplesPerPage dead tuples on a page just got
much more likely with HOT, and with larger block sizes 200 tuples isn't
very much anyway.

At the other end of the spectrum, with a smaller block size 200 is more
than MaxHeapTuplesPerPage, so we're still allocating more than necessary.

Note that as the patch stands, the capping is not limited to small tables.
Doing extra index passes on a relatively big table with lots of indexes
might cause a lot of real extra I/O.

How about just using MaxHeapTuplesPerPage? With the default 8K block size,
it's not that much more than 200, but it makes the above gripes completely
go away. That seems like the safest option at this point.

> Per reports from Jeff Amiel, Erik Jones and Marko Kreen, and subsequent
> discussion.

OK, I just read that discussion in the archives. A lot of good ideas were
suggested, like reducing the space required for the TID list, or dividing
maintenance_work_mem between workers. None of that is going to happen for
8.3, so it seems likely that we're going to revisit this in 8.4. Let's
keep it simple and safe for now.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
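For context, here is a rough sketch of the sizing logic under debate, modeled on lazy_space_alloc() in src/backend/commands/vacuumlazy.c; the helper name, constants, and exact clamps below are illustrative reconstructions, not the verbatim source:

    #include <stdio.h>

    #define ITEM_POINTER_SIZE   6    /* sizeof(ItemPointerData): one heap TID */
    #define LAZY_ALLOC_TUPLES   200  /* the contested per-page dead-tuple guess */

    /*
     * Rough model of how lazy vacuum sizes its dead-TID array: bounded by
     * maintenance_work_mem, then capped for small tables at LAZY_ALLOC_TUPLES
     * per heap page.  If the guess is below the real MaxHeapTuplesPerPage
     * (291 on 8K pages), a densely dead table can overflow the array and
     * force an extra index-cleanup pass no matter how much memory was set.
     */
    static long
    dead_tuple_capacity(long maintenance_work_mem_kb, long relblocks)
    {
        long maxtuples = (maintenance_work_mem_kb * 1024L) / ITEM_POINTER_SIZE;

        /* division-first form keeps the comparison free of overflow */
        if (maxtuples / LAZY_ALLOC_TUPLES > relblocks)
            maxtuples = relblocks * LAZY_ALLOC_TUPLES;

        return maxtuples;
    }

    int main(void)
    {
        /* a 100-page table under maintenance_work_mem = 128MB: the per-page
         * cap, not the memory limit, decides; 100 * 200 = 20000 TID slots */
        printf("%ld\n", dead_tuple_capacity(131072, 100));
        return 0;
    }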
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
Simon Riggs
Date:
On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > Log Message:
> > -----------
> > Reduce the size of memory allocations by lazy vacuum when processing a small
> > table, by allocating just enough for a hardcoded number of dead tuples per
> > page. The current estimate is 200 dead tuples per page.
>
> 200 sounds like a badly chosen value. With an 8K block size, that's a bit
> less than MaxHeapTuplesPerPage, which means that in the worst case you
> don't allocate enough space to hold all dead tuples, and you end up doing
> two index cleanups, no matter how large you set maintenance_work_mem.

Agreed. Tables with 2-4 columns easily fall into that category.
Association tables are often like this and can have 2 indexes on them.

> Note that as the patch stands, the capping is not limited to small tables.
> Doing extra index passes on a relatively big table with lots of indexes
> might cause a lot of real extra I/O.
>
> How about just using MaxHeapTuplesPerPage? With the default 8K block size,
> it's not that much more than 200, but it makes the above gripes completely
> go away. That seems like the safest option at this point.

It would be much better to use a value for each table. Any constant value
will be sub-optimal in many cases.

Let's use our knowledge of the table to calculate a sensible value. We
often have the average row length available from the last VACUUM, don't
we? Use that, plus 10%.

--
Simon Riggs
2ndQuadrant   http://www.2ndQuadrant.com
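To make the proposal concrete, a hypothetical sketch of Simon's idea; PostgreSQL has no such helper, and the function name, the 24-byte page-header cost, and the 4-byte line-pointer cost are all illustrative assumptions:

    #include <stdio.h>

    /*
     * Hypothetical helper along the lines Simon suggests: derive a
     * per-table dead-tuples-per-page cap from the average row length seen
     * by the last VACUUM, plus 10% headroom.
     */
    static long
    est_dead_tuples_per_page(int blcksz, int avg_row_len)
    {
        long per_tuple = avg_row_len + 4;         /* row data + line pointer */
        long tuples = (blcksz - 24) / per_tuple;  /* rows that fit on a page */

        return tuples + tuples / 10;              /* +10% slack, per Simon */
    }

    int main(void)
    {
        /* e.g. a 40-byte average row on 8K pages: 185 slots, +10% -> 203 */
        printf("%ld\n", est_dead_tuples_per_page(8192, 40));
        return 0;
    }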
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
"Heikki Linnakangas"
Date:
Simon Riggs wrote:
> On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
>> How about just using MaxHeapTuplesPerPage? With the default 8K block
>> size, it's not that much more than 200, but it makes the above gripes
>> completely go away. That seems like the safest option at this point.
>
> It would be much better to use a value for each table. Any constant
> value will be sub-optimal in many cases.

Allocating extra memory doesn't usually do much harm, as long as you don't
actually use it. The reason we're now limiting it is to avoid out-of-memory
errors if you're running with overcommit turned off and autovacuum triggers
a vacuum on multiple tables at the same time.

Let's keep it simple. A per-table setting would be much more complex and
would be something that the DBA would need to calculate and set. If you
really do run into this problem, you can just dial down
maintenance_work_mem.

> Let's use our knowledge of the table to calculate a sensible value. We
> often have the average row length available from the last VACUUM, don't
> we? Use that, plus 10%.

If there are dead line pointers in the table, left behind by HOT pruning,
the average row length would be completely bogus. Using the dead_tuples
stats directly would be more reasonable, but I would like us to keep this
even more conservative than that.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
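To put rough numbers on Heikki's objection, a back-of-the-envelope calculation with made-up figures (the 100-byte average row is hypothetical; the 24-byte page header and 4-byte line-pointer costs are assumptions for an 8K-page build):

    #include <stdio.h>

    /*
     * After HOT pruning, a dead tuple is reduced to a bare line pointer
     * (4 bytes), so the number of dead items a page can hold is bounded by
     * line-pointer space, not by the table's average row length.
     */
    int main(void)
    {
        int blcksz = 8192, page_header = 24;
        int avg_row_len = 100;                    /* hypothetical table */

        /* naive estimate from average row width plus a line pointer */
        int naive = (blcksz - page_header) / (avg_row_len + 4);

        /* after pruning, each dead item costs only its line pointer */
        int pruned = (blcksz - page_header) / 4;

        printf("naive: %d  pruned worst case: %d\n", naive, pruned);
        /* naive: 78  pruned worst case: 2042 -- and although
         * MaxHeapTuplesPerPage caps real pages at 291 dead items,
         * that is still nearly 4x the naive estimate */
        return 0;
    }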
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
Simon Riggs
Date:
On Mon, 2007-09-24 at 10:39 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
> >> How about just using MaxHeapTuplesPerPage? With the default 8K block
> >> size, it's not that much more than 200, but it makes the above gripes
> >> completely go away. That seems like the safest option at this point.
> >
> > It would be much better to use a value for each table. Any constant
> > value will be sub-optimal in many cases.
>
> Allocating extra memory doesn't usually do much harm, as long as you
> don't actually use it. The reason we're now limiting it is to avoid
> out-of-memory errors if you're running with overcommit turned off and
> autovacuum triggers a vacuum on multiple tables at the same time.
>
> Let's keep it simple. A per-table setting would be much more complex and
> would be something that the DBA would need to calculate and set. If you
> really do run into this problem, you can just dial down
> maintenance_work_mem.

Much more complex? RelationGetAvgFSM(relation)??

> > Let's use our knowledge of the table to calculate a sensible value. We
> > often have the average row length available from the last VACUUM, don't
> > we? Use that, plus 10%.
>
> If there are dead line pointers in the table, left behind by HOT pruning,
> the average row length would be completely bogus. Using the dead_tuples
> stats directly would be more reasonable, but I would like us to keep this
> even more conservative than that.

That's a better argument.

Since we have committed HOT, I would suggest we move the default of
autovacuum_max_workers down to 2. That will limit the default amount of
memory used by VACUUMs much better than trying to get the values precisely,
and we expect them to run much less frequently now anyway.

--
Simon Riggs
2ndQuadrant   http://www.2ndQuadrant.com
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
"Heikki Linnakangas"
Date:
Simon Riggs wrote:
> On Mon, 2007-09-24 at 10:39 +0100, Heikki Linnakangas wrote:
>> Let's keep it simple. A per-table setting would be much more complex and
>> would be something that the DBA would need to calculate and set. If you
>> really do run into this problem, you can just dial down
>> maintenance_work_mem.
>
> Much more complex? RelationGetAvgFSM(relation)??

The code isn't complex, but the resulting behavior is.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
Gregory Stark
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > Simon Riggs wrote: >> On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote: >>> How about just using MaxHeapTuplesPerPage? With the default 8K block >>> size, it's not that much more than 200, but makes the above gripes >>> completely go away. That seems like the safest option at this point. >> >> It would be much better to use a value for each table. Any constant >> value will be sub-optimal in many cases. > > Allocating extra memory doesn't usually do much harm, as long as you > don't actually use it. The reason we're now limiting it is to avoid Out > Of Memory errors if you're running with overcommit turned off, and > autovacuum triggers a vacuum on multiple tables at the same time. For reference, MaxHeapTuplesPerPage on an 8k block is 291. If there are any columns in your tuples (meaning they're not either HOT updates which have been pruned or rows with 8 or fewer columns all of which are null) then the most you can have is 255 rows. For the small difference between 200 and 291 it seems safer to just use MaxHeapTuplesPerPage. BS MHTPG Max w/data -------------------------- 4096 145 127 8192 291 255 16384 584 511 32768 1169 1023 -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
Alvaro Herrera
Date:
Gregory Stark wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> Simon Riggs wrote:
>>> On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
>>>> How about just using MaxHeapTuplesPerPage? With the default 8K block
>>>> size, it's not that much more than 200, but it makes the above gripes
>>>> completely go away. That seems like the safest option at this point.
>>>
>>> It would be much better to use a value for each table. Any constant
>>> value will be sub-optimal in many cases.
>>
>> Allocating extra memory doesn't usually do much harm, as long as you
>> don't actually use it. The reason we're now limiting it is to avoid
>> out-of-memory errors if you're running with overcommit turned off and
>> autovacuum triggers a vacuum on multiple tables at the same time.
>
> For reference, MaxHeapTuplesPerPage on an 8K block is 291. If there are
> any columns in your tuples (meaning they're not either HOT updates which
> have been pruned, or rows with 8 or fewer columns all of which are null)
> then the most you can have is 255 rows.

How about we change it to MaxHeapTuplesPerPage for now? That closes all
the complaints in this thread. Later we can think about better ways of
doing the whole thing, like using non-lossy tidbitmaps.

--
Alvaro Herrera                Developer, http://www.PostgreSQL.org/
"To have more, one must desire less"
Re: Re: [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when
From:
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> How about we change it to MaxHeapTuplesPerPage for now? That closes all
> the complaints in this thread.

Done, thanks for the input!

--
Alvaro Herrera   http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Rebellion is man's original virtue" (Arthur Schopenhauer)
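For completeness, the committed fix presumably boils down to redefining the per-page guess in vacuumlazy.c; the snippet below is a reconstruction from this thread, not the verbatim committed diff:

    /*
     * Reconstruction of the agreed change (not the verbatim diff).
     * In src/backend/commands/vacuumlazy.c the per-page guess
     *
     *     #define LAZY_ALLOC_TUPLES  200
     *
     * becomes the true per-page worst case, so the dead-TID array can
     * never be undersized for the pages a vacuum actually scans:
     */
    #define LAZY_ALLOC_TUPLES  MaxHeapTuplesPerPage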