Thread: Growing Database Size
Hi All,

I've got a bit of a dilemma with growing database sizes. Before upgrading to 7.3.2, my 7.3.0 database had an average size of 1.4-2.0GB. I have a large number of LO records in my DB and presumed the size growth was because of them. However, I do have the appropriate trigger functions on my LO records to ensure that lo_manage runs to unlink the old reference etc.

Since moving to 7.3.2, even though I VACUUM regularly, my DB size has increased to 3.2GB on disk! The raw data only consumes about 700-900MB, so I feel that an additional 2.2GB for index and statistics info is a bit much! Any clues from the experienced among the group?

PS - strange thing is that even with that huge change on disk, PG's performance hasn't really been affected... weird.

Thanks,
Jason
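For anyone unfamiliar with the trigger setup Jason mentions, a minimal sketch along the lines of the contrib/lo recipe looks like this (the table and column names here are placeholders):

    -- hypothetical table holding large-object references
    CREATE TABLE image (title TEXT, raster lo);

    -- lo_manage() unlinks the old large object whenever the reference
    -- column is updated or its row is deleted
    CREATE TRIGGER t_raster
        BEFORE UPDATE OR DELETE ON image
        FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);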
On Sat, Jun 14, 2003 at 16:35:16 +1000, Jason Godden <jasongodden@optushome.com.au> wrote:
> Since moving to 7.3.2, even though I VACUUM regularly, my DB size has
> increased to 3.2GB on disk! The raw data only consumes about 700-900MB,
> so I feel that an additional 2.2GB for index and statistics info is a
> bit much! Any clues from the experienced among the group?

You probably want to use a reindex. If you have an index on a column
that only increases or decreases over time, the index won't be able to
reuse pages that are no longer in use. This is fixed in 7.4.
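Spelled out, the suggestion is just a REINDEX on the affected objects (the names below are placeholders); note that in 7.3 REINDEX takes an exclusive lock on the table while it runs:

    -- rebuild every index on a bloated table
    REINDEX TABLE mytable;

    -- or rebuild only the suspect index
    REINDEX INDEX mytable_created_idx;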
> You probably want to use a reindex. If you have an index on a column
> that only increases or decreases over time, the index won't be able
> to reuse pages that are no longer in use. This is fixed in 7.4.

I'm kinda having the same type of problem. I'm updating an entire database nightly with:

    BEGIN
    DELETE  (deleting all rows from a table)
    DELETE  (deleting all rows from a table)
    COPY    (populating with new rows)
    COPY    (populating with new rows)
    COMMIT
    VACUUM ANALYZE

I don't fully understand the "FSM" settings, and I can't find any docs or info on how to tweak them besides an email now and then saying to raise them. Not only that, but (as of a few days ago) there seems to be no way to search prior posts on this topic. The archives seem to be offline; last time I tried, it seems my IP address wasn't in the "pg_hba.conf" file, and Google site: searches don't work either ("The page cannot be found" errors).

So my database just grows and grows, and, added to my monthly list of todos, I take the database offline into single-user mode and reindex and vacuum full.

Am I missing something? Does anyone know where to find any docs on how to adjust the FSM settings? I usually try to do my homework before asking these questions, but...

Any help in understanding this process would be great; I think it would help others too. Have you checked the size of your database lately?

Or is all this stuff updated in the newer versions? I'm running 7.2.1.
On Sat, 14 Jun 2003, Kenneth Godee wrote:
> I'm kinda having the same type of problem. I'm updating an entire
> database nightly with:
>
>     BEGIN
>     DELETE  (deleting all rows from a table)
>     DELETE  (deleting all rows from a table)
>     COPY    (populating with new rows)
>     COPY    (populating with new rows)
>     COMMIT
>     VACUUM ANALYZE
>
> Am I missing something? Does anyone know where to find any docs on how
> to adjust the FSM settings?

Well, is that a VACUUM or a VACUUM ANALYZE you're doing after that nightly delete/copy sequence? You need to run vacuum to stand any chance of your FSM settings actually helping you, since the FSM tracks freed space in the file store and vacuum is what does that scan. At least I don't think ANALYZE alone does, but I may well be mistaken on that (pretty basic, and I really should look at those details some time). Without that, the COPYs are just going to be sticking every new tuple after the space left by all the deleted tuples.

Also, why not use a delete, VACUUM FULL, copy sequence? I see you're doing this in a transaction, so you probably don't want the alternative of TRUNCATE then COPY, which I think (but don't know) will truncate the file store. Also, is there some reason you don't want to run VACUUM FULL more frequently?

> Any help in understanding this process would be great, I think it
> would help others too.

FSM stands for Free Space Map. The space taken by deleted tuples is not reused unless that space is recorded in the FSM. Plain VACUUM fills in the FSM, and VACUUM FULL moves undeleted tuples into the space taken by deleted ones (without using the FSM).

> Or is all this stuff updated in the newer versions, I'm running 7.2.1.

Upgrade to 7.2.4; it should be painless and it's highly recommended.

And if you're not reusing the same range of indexed values for each new load of data, I would expect you'll find the disk usage increasing for those indexes. There isn't a way to avoid the periodic REINDEX if you want to maintain reasonable disk usage in that case. I'm not sure whether that's different in 7.3 or 7.4.

--
Nigel J. Andrews
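For concreteness, the delete/VACUUM FULL/copy sequence suggested above would look something like this (table names and file paths are placeholders). Note that VACUUM cannot run inside a transaction block, so unlike the original BEGIN...COMMIT version this sequence is not atomic; other sessions may briefly see empty tables:

    DELETE FROM a;
    DELETE FROM b;
    VACUUM FULL a;      -- compacts the files, returning space to the OS
    VACUUM FULL b;
    COPY a FROM '/path/to/a.dat';
    COPY b FROM '/path/to/b.dat';
    VACUUM ANALYZE;     -- refresh planner statistics after the load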
On 14 Jun 2003 at 13:50, Kenneth Godee wrote:
> I'm kinda having the same type of problem. I'm updating an entire
> database nightly with:
>
>     BEGIN
>     DELETE  (deleting all rows from a table)
>     DELETE  (deleting all rows from a table)
>     COPY    (populating with new rows)
>     COPY    (populating with new rows)
>     COMMIT
>     VACUUM ANALYZE

Wow, that would really kill things. How about:

    BEGIN;
    DROP TABLE a;
    DROP TABLE b;
    CREATE TABLE a (...);   -- same definitions as before
    CREATE TABLE b (...);
    COPY a FROM '/path/to/a.dat';
    COPY b FROM '/path/to/b.dat';
    COMMIT;
    VACUUM ANALYZE;

I can almost bet that this would be faster than the solution above, as it would not churn the disk as much, put simply.

Hmm... just wondering, would TRUNCATE in the above steps do the same job?

> Am I missing something? Does anyone know where to find any docs on how
> to adjust the FSM settings?

This is off the top of my head, so take it with a grain of salt and correct me if I am wrong. max_fsm_relations should cover the number of relations/tables in the database that get populated. max_fsm_pages counts 8KB pages: each deleted or updated page needs a small (few-byte) entry in the FSM. So if you delete/update 100K pages, the map has to be able to hold 100K entries. Unless you have a big enough FSM, vacuum won't be as effective, as it won't be able to track all the space wasted.

HTH

Bye
Shridhar

--
Peers's Law: The solution to a problem changes the nature of the problem.
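To put numbers on that, the two settings live in postgresql.conf (changing them requires a postmaster restart; the values below are illustrative starting points, not recommendations):

    # postgresql.conf (7.2/7.3 era)
    max_fsm_relations = 1000    # relations whose free space is tracked
    max_fsm_pages = 100000      # total pages with free space the map can remember

    # sizing sketch: a nightly delete of ~800MB of row data touches roughly
    # 800MB / 8KB = ~100,000 pages, so max_fsm_pages should be at least that;
    # at roughly 6 bytes per page slot, that is only ~600KB of shared memory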
On Sun, Jun 15, 2003 at 04:05:59PM +0530, Shridhar Daithankar wrote:
> How about:
>
>     BEGIN;
>     DROP TABLE a;
>     DROP TABLE b;
>     CREATE TABLE a (...);
>     CREATE TABLE b (...);
>     COPY a FROM '/path/to/a.dat';
>     COPY b FROM '/path/to/b.dat';
>     COMMIT;
>     VACUUM ANALYZE;
>
> I can almost bet that this would be faster than the solution above, as
> it would not churn the disk as much, put simply.
>
> Hmm... just wondering, would TRUNCATE in the above steps do the same job?

Yes, TRUNCATE TABLE would do the same job and, more importantly, wouldn't go mucking about in the catalog tables. Assuming it's available in 7.2, you should absolutely use TRUNCATE instead of drop/create. (Man, I wish more databases supported TRUNCATE.)

--
Jim C. Nasby (aka Decibel!)    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
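A sketch of the TRUNCATE variant (again, names and paths are placeholders). Before 7.4, TRUNCATE refuses to run inside a transaction block, which is the transaction-safety caveat Tom raises below:

    TRUNCATE TABLE a;    -- near-instant, but cannot be rolled back pre-7.4
    TRUNCATE TABLE b;
    COPY a FROM '/path/to/a.dat';
    COPY b FROM '/path/to/b.dat';
    VACUUM ANALYZE;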
"Jim C. Nasby" <jim@nasby.net> writes: > On Sun, Jun 15, 2003 at 04:05:59PM +0530, Shridhar Daithankar wrote: >> On 14 Jun 2003 at 13:50, Kenneth Godee wrote: >> begin >> drop table a; >> drop table b; >> create table a; >> create table b; >> copy -> table a; >> copy ->table b; >> commit; >> vacuum analyze; >> Hmm..Just wondering, will truncate in above steps would do the same job? > Yes, truncate table would do the same job and more importantly, wouldn't > go mucking about in the catalog tables. Assuming it's available in 7.2, > you should absolutely use truncate instead of drop/create (man I wish > more databases supported truncate). However, the originally given sequence is transaction-safe (ie, on failure it will roll back to the original state), whereas something involving TRUNCATE will not be. Until 7.4 anyway. Shridhar's drop/recreate is transaction-safe and looks like a good solution if you're not worried about foreign key linkages and such. regards, tom lane