Thread: Vacuuming strategy
On Tue, Apr 29, 2014 at 4:59 PM, Elanchezhiyan Elango <elanelango@gmail.com> wrote:
[...]
> With the above query pattern with intensive updates and deletes, I need to
> do some aggressive vacuuming.
>
> Current strategy: I am running with default autovacuum settings (postgres
> 9.1.9) and I tried doing a 'vacuum full' for the 8 high-traffic tables
> (Tables1 and Tables3) every night. But after a point, the 'vacuum full's
> started timing out (with a 4-minute timeout) every night. I think this is
> because the table is growing bigger (~5GB) and doing a vacuum full every
> night is probably not feasible.
>
> Going with the default autovacuum settings and not doing 'vacuum full' at
> all is also not enough for my use case. Whenever vacuum full succeeded
> every night, it did seem to reclaim a considerable amount of space. So I
> assume autovacuum is not able to reclaim all space.

It is able to reclaim space at the end of the table (it truncates empty pages, those with no live tuples, from the tail), and depending on the autovacuum settings you might get a result very close to a full vacuum. Keep in mind that to reuse the space of deleted or updated tuples (at the low, tuple level, any update is roughly a delete plus an insert) vacuum/autovacuum must first mark them as available. So, to avoid bloat, vacuum needs to manage this faster than your workload produces new dead tuples. The allowed bloat fraction can be controlled with the _threshold and _scale_factor parameters.

> What approach should I take? Do I require 'vacuum full'? What autovacuum
> settings should I tweak so that I can avoid vacuum full, if possible, and
> maintain a steady state without bloating the tables?

I do not think you need vacuum full here. Start with the settings below. They are pretty aggressive, so after some time, by analyzing the logs (log_autovacuum_min_duration = 0 logs all autovacuum runs), you might come to a decision to relax these settings if needed.

log_autovacuum_min_duration = 0
autovacuum_max_workers = 5
autovacuum_naptime = 10s
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_cost_delay = 5ms

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979
gray.ru@gmail.com
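For reference, autovacuum only vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. A minimal sketch of what the suggested scale factor changes, assuming a table named 'xyz' as in the pgstattuple output later in this thread:

-- Dead-tuple debt a table accumulates before autovacuum fires,
-- at the default scale factor (0.2) vs. the suggested 0.05.
-- 50 is the default autovacuum_vacuum_threshold.
SELECT relname,
       50 + 0.2  * reltuples AS fires_at_default,
       50 + 0.05 * reltuples AS fires_at_suggested
FROM pg_class
WHERE relname = 'xyz';

For the ~18.9M-row table shown later in the thread, that is roughly 3.8M dead tuples before the first vacuum under the defaults, versus roughly 0.9M with 0.05.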
Hi:

On Wed, Apr 30, 2014 at 1:59 AM, Elanchezhiyan Elango <elanelango@gmail.com> wrote:
> I need help on deciding my vacuuming strategy. I need to know if I ever need
> to do 'vacuum full' for my tables.
[...]

Not a vacuuming strategy, but have you considered partitioning? I mean, your pattern seems to be: insert rows, update them during the hour, delete daily. If you have a clear timestamp key, your problem seems well suited for it, especially given you seem to have an hourly process (the insert in step 2) and a daily process (the deletion in step 3).

I can see two potential uses for partitions here. One is the classic: say, a partition per day. In step 3 you drop the old partition and create the one for the next day; this is fast, and autovacuum will keep the current table tidy enough. You could even complement it with a full vacuum of the newly finished partition. Something like this: if you want to keep K days, then shortly after the start of day D (not at exact midnight, to avoid races) you drop the partition for day D-K, create the one for D+1 (the one for D was created beforehand, on D-1, to avoid races), set it up for adequate autovacuuming and, if you want, do a vacuum full and reindex of day D-1 (which is now constant). If you need to keep hundreds of days you could go for weekly or monthly partitions; if you keep only a few, you could consider hourly ones.

Another approach could be to make, say, daily partitions, but with no rules/triggers to redirect inserts (just the constraints and inheritance needed to get proper selects), and let the hourly inserts (and subsequent updates) go to the main table. Then, in the hourly job, you move all the now-constant past-hour rows to the appropriate partition (which is just an "insert into part ... delete from main where adequate_range ... returning *"). The partitions will not need reindexing or vacuuming as they become append-only, and the main table will hold between one and two hours of records, so it would probably be adequately managed by autovacuum, which should be fast given the table stays small. Old partitions can be dropped when no longer needed, as above. See the sketch below.

Francisco Olarte.
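A minimal sketch of the inheritance-style partitioning Francisco describes, for PostgreSQL 9.1 (which predates declarative partitioning); the table and column names are invented for illustration:

-- Parent holds no data of its own; children carry the rows.
CREATE TABLE stats (
    ts  timestamptz NOT NULL,
    val bigint
);

-- One child per day, with a CHECK constraint so that
-- constraint_exclusion can skip it in SELECTs.
CREATE TABLE stats_2014_04_30 (
    CHECK (ts >= '2014-04-30' AND ts < '2014-05-01')
) INHERITS (stats);

-- Step 3 becomes a near-instant DROP instead of a 1.2M-row DELETE,
-- and it leaves nothing behind for vacuum to clean up.
DROP TABLE stats_2014_03_31;

-- Second variant: inserts and updates hit the parent directly; an
-- hourly job moves the now-constant past hour into its partition
-- (writable CTEs are available on 9.1).
WITH moved AS (
    DELETE FROM ONLY stats
    WHERE ts >= '2014-04-30 10:00' AND ts < '2014-04-30 11:00'
    RETURNING *
)
INSERT INTO stats_2014_04_30 SELECT * FROM moved;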
Hi,

I need help on deciding my vacuuming strategy. I need to know if I ever need to do 'vacuum full' for my tables.

Tables1: Following is the query pattern on 4 high-traffic tables in my database:
1. Every 5 minutes about 50000 rows in the table are updated. And for a given clock hour the same 50000 rows are updated again and again.
2. When a new hour begins, another 50000 rows get inserted and they get updated every 5 minutes.
3. Every night a day's worth of old stats are deleted. So this would be 24 * 50000 = 1.2M records that get deleted every night.

Tables3: Another 4 tables have the following pattern:
1. Every 1 hour 50000 rows get updated. For a given day the same 50000 rows are updated again and again.
2. When a new day begins, another 50000 rows get inserted and they get updated every hour.
3. Every night 1.2M records get deleted.

Tables4: Another 39 tables have the following pattern:
1. Every 1 hour 2000 rows get updated. For a given day the same 2000 rows are updated again and again.
2. When a new day begins, another 2000 rows get inserted and they get updated every hour.
3. Every night 48000 rows get deleted.

With the above query pattern, with intensive updates and deletes, I need to do some aggressive vacuuming.

Current strategy: I am running with default autovacuum settings (postgres 9.1.9) and I tried doing a 'vacuum full' for the 8 high-traffic tables (Tables1 and Tables3) every night. But after a point, the 'vacuum full's started timing out (with a 4-minute timeout) every night. I think this is because the table is growing bigger (~5GB) and doing a vacuum full every night is probably not feasible.

Going with the default autovacuum settings and not doing 'vacuum full' at all is also not enough for my use case. Whenever vacuum full succeeded every night, it did seem to reclaim a considerable amount of space.
On Tue, Apr 29, 2014 at 4:59 PM, Elanchezhiyan Elango <elanelango@gmail.com> wrote:
[...]
> Tables1: Following is the query pattern on 4 high-traffic tables in my
> database:
> 1. Every 5 minutes about 50000 rows in the table are updated. And for a
> given clock hour the same 50000 rows are updated again and again.
> 2. When a new hour begins, another 50000 rows get inserted and they get
> updated every 5 minutes.
> 3. Every night a day's worth of old stats are deleted. So this would be
> 24 * 50000 = 1.2M records that get deleted every night.

What indexes exist? Are the updates to indexed columns? Which day's tuples are deleted every night? Is it the day a week ago, or a month ago, or something else?

> Tables3: Another 4 tables have the following pattern:
> [...]
> 3. Every night 1.2M records get deleted.

If only 50,000 get inserted daily and 1.2M get deleted, this table will soon be empty! I think you said daily when you meant hourly somewhere in there.

> Current strategy: I am running with default autovacuum settings (postgres
> 9.1.9) and I tried doing a 'vacuum full' for the 8 high-traffic tables
> (Tables1 and Tables3) every night. But after a point, the 'vacuum full's
> started timing out (with a 4-minute timeout) every night.

Why do you have a 4 minute timeout? That seems counter-productive.

> I think this is because the table is growing bigger (~5GB) and doing a
> vacuum full every night is probably not feasible.

It is probably not necessary, but it certainly seems feasible. 4 min * 8 tables = 32 minutes. Call it one hour, since the 4 minute timeout has started not being enough. Is the 5GB for the table plus indexes, or just the table itself?

> Going with the default autovacuum settings and not doing 'vacuum full' at
> all is also not enough for my use case. Whenever vacuum full succeeded
> every night, it did seem to reclaim a considerable amount of space.

You delete a bunch of tuples every night, so of course a vacuum full after that is going to return a lot of space. But that space is probably just going to be needed again the next day. If you don't do the vacuum full, does the *peak* space keep increasing, or does it stabilize?

Cheers,
Jeff
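One way to answer the peak-space question empirically is to sample the table's on-disk footprint periodically (for example from a nightly cron job) and watch whether it plateaus; 'xyz' here stands in for one of the high-traffic tables:

-- Heap, index, and total size for one table; log this daily and
-- compare peaks across days.
SELECT now() AS sampled_at,
       pg_size_pretty(pg_relation_size('xyz'))       AS heap,
       pg_size_pretty(pg_indexes_size('xyz'))        AS indexes,
       pg_size_pretty(pg_total_relation_size('xyz')) AS total;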
Sergey,
Thanks for the aggressive settings. I have listed some settings I am planning to try below. Please review and let me know your feedback.

Francisco,
Thanks for the partitioning idea. I used to have the tables partitioned. But now that I have moved to a schema where data is split across about 90 tables, I have moved away from partitioning. But it's something I have to reconsider, at least for the high-traffic tables.

> What indexes exist? Are the updates to indexed columns?

Each table has 4 indices. The updates are to the indexed columns. Here is the schema of a table: http://pastebin.com/ffu0dUjS All tables have this same schema except that some tables don't have a port column and so will have one less index.

> Which day's tuples are deleted every night? Is it the day a week ago, or a
> month ago, or something else?

Tables1, Tables2: Anything older than 1 month is deleted.
Tables3, Tables4: Anything older than 3 months is deleted.

> If only 50,000 get inserted daily and 1.2M get deleted, this table will
> soon be empty! I think you said daily when you meant hourly somewhere in
> there.

Sorry, that was wrong. For Tables3 it should have been 50000 rows (not 1.2M) getting deleted daily. And for Tables4 it should have been 2000 rows (not 48000) getting deleted daily.

> Why do you have a 4 minute timeout? That seems counter-productive.

Oh, should it be less or more?

> If you don't do the vacuum full, does the *peak* space keep increasing, or
> does it stabilize?

I haven't tested to see if the space keeps increasing. I did pgstattuple() on one of the tables:

managed_target_stats=# select * from pgstattuple('xyz');
table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
------------+-------------+------------+---------------+------------------+----------------+--------------------+------------+--------------
5642526720 | 18880283 | 4042874387 | 71.65 | 122037 | 21495560 | 0.38 | 1239598044 | 21.97
This is one of the Tables1 tables, after running for a week or so with the default autovacuum settings. The dead_tuple_percent looks good, but the free_percent looks high. Is this normal? Also, when I enabled autovacuum logs, I saw autovacuum triggering very rarely. Maybe that's the reason for so much free space? I am going to try the following settings:
log_autovacuum_min_duration = 0
autovacuum_vacuum_scale_factor = 0
autovacuum_vacuum_threshold = 40000
autovacuum_vacuum_cost_delay = 10ms
autovacuum_max_workers = 5
autovacuum_analyze_scale_factor = 0
autovacuum_analyze_threshold = 40000
Do these settings look OK? I am setting autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor to 0 and the thresholds to a constant 40000. My reasoning: updates happen in a burst every 5 minutes, and the upper bound is 50000 updates every 5 minutes, so I have just hardcoded a threshold of 40000. Also, I am leaving autovacuum_naptime at the default of 1 min because updates happen at most every 5 minutes, so I don't see a point in running autovacuum more frequently than every minute. Correct me if my settings look wrong.
Thanks,
Elan.
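One way to verify whether these thresholds make autovacuum fire as intended, without trawling the logs, is the statistics collector (a sketch; 'xyz' is a stand-in table name, and the autovacuum_count column is available from 9.1 on):

-- How much dead-tuple debt is outstanding, and how often has
-- autovacuum actually run on this table?
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       autovacuum_count   -- cumulative runs since last stats reset
FROM pg_stat_user_tables
WHERE relname = 'xyz';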
I missed answering this one:

> Is the 5GB for the table plus indexes, or just the table itself?

No, it's not including the indices. Including indices it's actually 17GB!!
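With roughly 12GB of that 17GB in indexes, it may be worth seeing how the space splits across the four indexes on each table; a sketch, again using 'xyz' as a stand-in name:

-- Per-index size and usage for one table, largest first.
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan   -- is the index even being used?
FROM pg_stat_user_indexes
WHERE relname = 'xyz'
ORDER BY pg_relation_size(indexrelid) DESC;

Indexes on frequently updated columns tend to bloat faster than the heap; if most of the space is there, building a fresh index with CREATE INDEX CONCURRENTLY and dropping the old one reclaims it with far less locking than VACUUM FULL.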
On Wed, Apr 30, 2014 at 10:40 AM, Elanchezhiyan Elango <elanelango@gmail.com> wrote:
> log_autovacuum_min_duration = 0
>
> autovacuum_vacuum_scale_factor = 0
> autovacuum_vacuum_threshold = 40000
> autovacuum_analyze_scale_factor = 0
> autovacuum_analyze_threshold = 40000

I don't think it is a good idea to set a zero scale factor and this high a threshold instance-wide. You can try per-table settings instead if you want.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979
gray.ru@gmail.com
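For completeness, per-table autovacuum settings are storage parameters set with ALTER TABLE; a sketch for one of the high-traffic tables (the table name is hypothetical):

ALTER TABLE xyz SET (
    autovacuum_vacuum_scale_factor  = 0,
    autovacuum_vacuum_threshold     = 40000,
    autovacuum_analyze_scale_factor = 0,
    autovacuum_analyze_threshold    = 40000
);

The instance-wide defaults then remain in force for the other, smaller tables.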
Hi:

On Wed, Apr 30, 2014 at 7:40 PM, Elanchezhiyan Elango <elanelango@gmail.com> wrote:
> Francisco,
> Thanks for the partitioning idea. I used to have the tables partitioned. But
> now that I have moved to a schema where data is split across about 90
> tables I have moved away from partitioning. But it's something I have to
> reconsider at least for the high-traffic tables.

I've discovered partitioning lends itself to some nifty tricks on high-traffic tables, especially when data is partitioned by a timestamp key correlated with insertion times (in our case these are CDRs).

Aside from the ease of dropping old partitions and not having to vacuum them, I've been testing a couple of tricks: dropping indexes on old partitions (which are no longer queried frequently, and whose queries normally get a full table scan anyway), or moving no-longer-updated partitions to a separate archive schema. They still get queried (inheritance takes care of this), but this unclutters my table listings and lets me do a backup optimization: I instruct pg_dump to dump the non-updating schema only whenever I change it, and all but the updating one more frequently. (This is for pg_dump backups, of which I keep a rotating series to protect against pilot/software errors and to restore in auxiliary servers for testing/data mining; for disk failure we have a separate replication/log-archiving setup.) This division also aided me when I had to do a 9.1-to-9.3 upgrade, as we only needed to dump/restore the updating schemas in the downtime window and could redo the archived one afterwards. Of course, I need a full set of scripts to maintain this, and if you decide to build something like it you'll need an even bigger one.

Also, what puzzles me is your insert/update/delete pattern of access. When I've found this I normally have just inserts and deletes. It seems like you are doing cumulative hourly stats but updating them in place. I've found myself doing the same (daily stats updated in place, to have the current day's view growing in 'real' time) and switched to a schema where I inserted into an aux table, which was queried using sum()/avg(), and added and moved the totals to another table once the day was done. It was easier on vacuums, as the daily table just needed a plain daily vacuum after the big delete, not a full one, since the space was going to be reused. (This was before partitioning, and I used a view to query a union of the totals table plus an aggregated query of the incremental one; it worked quite well.)

Francisco Olarte.
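A rough sketch of the aux-table-plus-view arrangement Francisco describes (all table, column, and view names are invented for illustration):

-- Finished days, pre-aggregated: one row per key and day.
CREATE TABLE daily_totals (
    day   date   NOT NULL,
    key   int    NOT NULL,
    total bigint NOT NULL
);

-- Today's raw samples: insert-only, never updated in place.
CREATE TABLE daily_samples (
    day date   NOT NULL,
    key int    NOT NULL,
    val bigint NOT NULL
);

-- Readers see one relation covering finished and current days.
CREATE VIEW daily_stats AS
    SELECT day, key, total FROM daily_totals
    UNION ALL
    SELECT day, key, sum(val)::bigint FROM daily_samples GROUP BY day, key;

-- Nightly rollover: fold finished days into the totals table. A
-- plain (not FULL) vacuum afterwards marks the space for reuse.
WITH rolled AS (
    DELETE FROM daily_samples WHERE day < current_date RETURNING *
)
INSERT INTO daily_totals
    SELECT day, key, sum(val)::bigint FROM rolled GROUP BY day, key;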
> Each table has 4 indices. The updates are to the indexed columns.

Sorry, I was wrong when I said the updates are to the indexed columns. The updates are to the 'data[]' column, which doesn't have any index.