Thread: db growing out of proportion
Hello everybody,

I'm facing a simple yet grave problem with postgresql 7.3.2 on x86 Linux.
My db is used to store IP accounting statistics for about 30 C's. There are
a couple of truly trivial tables such as the one below:

CREATE TABLE stats_min
(
        ip      inet NOT NULL,
        start   timestamp NOT NULL default CURRENT_TIMESTAMP(0),
        intlen  int4 NOT NULL default 60,
        d_in    int8 NOT NULL,
        d_out   int8 NOT NULL,

        constraint "stats_min_pkey" PRIMARY KEY ("ip", "start")
);
CREATE INDEX stats_min_start ON stats_min (start);

A typical transaction committed on these tables looks like this:

        BEGIN WORK
        DELETE ...
        UPDATE/INSERT ...
        COMMIT WORK

Trouble is, as the rows in the tables get deleted/inserted/updated
(the frequency being a couple thousand rows per minute), the database
grows out of proportion in size. After about a week, I have to redump
the db by hand to get query times back to sensible figures. A transaction
that takes ~50 seconds before the redump will then complete in under
5 seconds (the corresponding data/base/ dir having shrunk from ~2 GB
to ~0.6 GB).

A nightly VACUUM ANALYZE is no use.

A VACUUM FULL is no use.

A VACUUM FULL followed by REINDEX is no use.

It seems that only a full redump involving "pg_dump olddb | psql newdb"
is capable of restoring the system to its working glory.

Please accept my apologies if I've overlooked a relevant piece of
information in the docs. I'm in urgent need of getting this problem
resolved.

--
Tomas Szepe <szepe@pinerecords.com>
On Thu, 29 May 2003, Tomas Szepe wrote:

> I'm facing a simple yet grave problem with postgresql 7.3.2 on x86 Linux.
> My db is used to store IP accounting statistics for about 30 C's.
> [snip]
> Trouble is, as the rows in the tables get deleted/inserted/updated
> (the frequency being a couple thousand rows per minute), the database
> is growing out of proportion in size. After about a week, I have
> to redump the db by hand so as to get query times back to sensible
> figures.
>
> A nightly VACUUM ANALYZE is no use.
>
> A VACUUM FULL is no use.
>
> A VACUUM FULL followed by REINDEX is no use.

Is the space being taken up by stats_min, this index, some other object?
I'm not 100% sure, but after vacuums, maybe

  select * from pg_class order by relpages desc limit 10;

will give a good idea.

What does VACUUM FULL VERBOSE stats_min; give you?
> [sszabo@megazone23.bigpanda.com]
>
> Is the space being taken up by stats_min, this index, some other object?

             relname             | relkind | relpages |  reltuples
---------------------------------+---------+----------+-------------
 stats_hr                        | r       |    61221 | 3.01881e+06
 stats_hr_pkey                   | i       |    26414 | 3.02239e+06
 stats_min_pkey                  | i       |    20849 |      953635
 stats_hr_start                  | i       |    17218 | 3.02142e+06
 stats_min_start                 | i       |    15284 |      949788
 stats_min                       | r       |    10885 |      948792
 authinfo_pkey                   | i       |     1630 |        1342
 authinfo                        | r       |     1004 |        1342
 contract_ips                    | r       |      865 |         565
 contract_ips_pkey               | i       |      605 |         565

> What does VACUUM FULL VERBOSE stats_min; give you?

Sorry, I can't run a VACUUM FULL at this time.
We're in production use.

--
Tomas Szepe <szepe@pinerecords.com>
On Fri, 30 May 2003, Tomas Szepe wrote:

> > Is the space being taken up by stats_min, this index, some other object?
>
> [snip]
>
> > What does VACUUM FULL VERBOSE stats_min; give you?
>
> Sorry, I can't run a VACUUM FULL at this time.
> We're in production use.

	Would more regular vacuuming help? I think a vacuum every hour
may do the job, perhaps with an ANALYZE every day (I presume the
statistics don't change too much). While I don't suggest doing a vacuum
more than twice an hour, as that would slow down the system with little
gain, vacuuming more often than once a day may improve the speed and
space usage. Just an idea.

Peter
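[A schedule like Peter's can be wired up with cron. This is only a
sketch: the database name and the choice of tables are placeholders,
and on 7.3 there is no autovacuum, so plain VACUUMs have to be issued
explicitly like this.]

```
# crontab for the postgres user (illustrative)
# Hourly plain VACUUM of the high-churn tables only:
0 * * * *   psql -d accounting -c 'VACUUM stats_min; VACUUM stats_hr;'
# Nightly VACUUM ANALYZE of the whole database to refresh statistics:
30 3 * * *  vacuumdb --analyze accounting
```

Plain VACUUM only marks dead tuples as reusable; run it often enough
that free space is recycled before the tables bloat.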
Peter Childs <blue.dragon@blueyonder.co.uk> writes:
> On Fri, 30 May 2003, Tomas Szepe wrote:
>> Trouble is, as the rows in the tables get deleted/inserted/updated
>> (the frequency being a couple thousand rows per minute), the database
>> is growing out of proportion in size.

> Would more regular vacuuming help? I think a vacuum every hour may do
> the job.

Also note that no amount of vacuuming will save you if the FSM is not
large enough to keep track of all the free space. The default FSM
settings, like all the other default settings in Postgres, are set up
for a small installation. You'd probably need to raise them by at least
a factor of 10 for this installation.

			regards, tom lane
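[For reference, the free-space-map knobs Tom is talking about live in
postgresql.conf. The values below are illustrative only, roughly the
10x increase over the 7.3-era defaults he suggests; changing them
requires a postmaster restart on these versions.]

```
# postgresql.conf (7.3-era parameter names; defaults were 1000 / 20000)
max_fsm_relations = 10000    # max tables/indexes with tracked free space
max_fsm_pages     = 200000   # max pages with tracked free space, cluster-wide
```

If max_fsm_pages is smaller than the number of pages freed between
vacuums, the overflow pages are forgotten and the tables grow even
though VACUUM runs regularly.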
I have a database with similar performance constraints. Our best
estimates put the turnover on our most active table at 350k tuples/day.
The hardware is a 4x1.4GHz Xeon w/ a RAID 1 disk setup, and the DB
floats around 500MB of disk space taken. Here is what we do to maintain
operations:

1) Cron job @ 4:00AM that runs a full vacuum analyze on the DB, and
   reindex on the major tables. (Reindex is to maintain index files in
   SHM.) An alerting feature pages the administrator if the job does
   not complete within a reasonable amount of time.

2) Every 15 minutes, a cron job runs a vacuum analyze on our five
   largest tables. An alert is emailed to the administrator if a second
   vacuum attempts to start before the previous completes.

3) Every week, we review the disk usage numbers from daily peaks. This
   determines if we need to increase our shmmax & shared buffers.

Additionally, you may want to take a look at your query performance.
Are most of your queries doing sequential scans? In my system, the
crucial columns of the primary tables are int8 and float8 fields. I
have those indexed, and I get a serious performance boost by making
sure all SELECT/UPDATE/DELETE queries that use those columns in the
WHERE have an explicit ::int8 or ::float8 (EXPLAIN ANALYZE is your
friend). During peak usage, there is an order of magnitude difference
(usually 10 to 15x) between queries doing sequential scans on the
table, and queries doing index scans. Might be worth investigating if
your queries are taking 5 seconds when your DB is fresh. HTH.

Tomas Szepe wrote:
> I'm facing a simple yet grave problem with postgresql 7.3.2 on x86
> Linux. My db is used to store IP accounting statistics for about 30
> C's.
> [snip]
> A nightly VACUUM ANALYZE is no use.
>
> A VACUUM FULL is no use.
>
> A VACUUM FULL followed by REINDEX is no use.
>
> It seems that only a full redump involving "pg_dump olddb | psql
> newdb" is capable of restoring the system to its working glory.
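[The cast issue described above was a well-known planner limitation in
the 7.x series: a bare integer literal is typed int4, so a comparison
against an indexed int8 column would not use the index. A minimal
illustration; the table and column names here are hypothetical, not
from the poster's schema. This was fixed in PostgreSQL 8.0.]

```sql
-- Hypothetical table with an int8 key, as in the setup described above.
CREATE TABLE samples (sample_id int8 PRIMARY KEY, payload float8);

-- On 7.x this tends to produce a sequential scan, because the literal
-- 42 is typed int4 and doesn't match the int8 index:
SELECT * FROM samples WHERE sample_id = 42;

-- An explicit cast (or a quoted literal, which is resolved late) lets
-- the planner use the index:
SELECT * FROM samples WHERE sample_id = 42::int8;
SELECT * FROM samples WHERE sample_id = '42';
```

Checking each variant with EXPLAIN ANALYZE shows whether the plan flips
from Seq Scan to Index Scan.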
On Fri, 30 May 2003, Tomas Szepe wrote:

> > What does VACUUM FULL VERBOSE stats_min; give you?
>
> Sorry, I can't run a VACUUM FULL at this time.
> We're in production use.

As Tom said, you probably need higher FSM settings, but also, do you
have any long-lived transactions (say from some kind of persistent
connection system) that might be preventing vacuum from removing rows?
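[One way to check for such transactions. Note this is a sketch for
modern PostgreSQL (9.2 or later); the pg_stat_activity view on 7.3 was
far more limited and lacked these columns.]

```sql
-- Sessions sitting idle inside an open transaction, oldest first.
-- VACUUM cannot reclaim any row version that such a transaction
-- might still see.
SELECT pid, usename, xact_start,
       now() - xact_start AS open_for
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;
```

A connection pool that opens a transaction and never commits is a
classic cause of unreclaimable bloat.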
> [tgl@sss.pgh.pa.us]
>
> Also note that no amount of vacuuming will save you if the FSM is not
> large enough to keep track of all the free space. The default FSM
> settings, like all the other default settings in Postgres, are set up
> for a small installation. You'd probably need to raise them by at
> least a factor of 10 for this installation.

Thanks, I'll try to tweak those settings and will let the list know how
things went.

--
Tomas Szepe <szepe@pinerecords.com>
> As Tom said, you probably need higher FSM settings, but also, do you
> have any long lived transactions (say from some kind of persistent
> connection system) that might be preventing vacuum from removing rows?

No, not at all.

--
Tomas Szepe <szepe@pinerecords.com>
> [todd@twopunks.org]
>
> Additionally, you may want to take a look at your query performance.
> Are most of your queries doing sequential scans?
> [snip]
> Might be worth investigating if your queries are taking 5 seconds
> when your DB is fresh. HTH.

Yes, I have taken special care to fine-tune all queries on authentic
data. The db setup works as expected in every respect, with the
exception of the query-time deterioration that apparently correlates
with the db's on-disk size growth.

Thanks for your suggestions,

--
Tomas Szepe <szepe@pinerecords.com>
On Fri, 30 May 2003 09:11:39 -0400 Tom Lane <tgl@sss.pgh.pa.us> said
something like:

> Also note that no amount of vacuuming will save you if the FSM is not
> large enough to keep track of all the free space. The default FSM
> settings, like all the other default settings in Postgres, are set up
> for a small installation. You'd probably need to raise them by at
> least a factor of 10 for this installation.

Tom,

Thanks for the hint. I just upped my shared_buffers to 8192,
max_fsm_relations to 10000, max_fsm_pages to 100000, sort_mem to 64000,
and an UPDATE which was taking over 2 hours dropped down to 1 to 2
minutes! Nice...

Thanks,
Rob
Robert Creager <Robert_Creager@LogicalChaos.org> writes:
> Thanks for the hint. I just upped my shared_buffers to 8192,
> max_fsm_relations to 10000, max_fsm_pages to 100000, sort_mem to
> 64000, and an UPDATE which was taking over 2 hours dropped down to
> 1 to 2 minutes!

Cool ... but it's not immediately obvious which of these changes did
the trick for you. What settings were you at before? And what are the
details of the problem query?

The first three settings you mention all seem like reasonable choices,
but I'd be hesitant to recommend 64M sort_mem for general use (it won't
take very many concurrent sorts to drive you into the ground...). So
I'm interested to narrow down exactly what was the issue here.

			regards, tom lane
On Sat, 31 May 2003 00:11:26 -0400 Tom Lane <tgl@sss.pgh.pa.us> said
something like:

> Cool ... but it's not immediately obvious which of these changes did
> the trick for you. What settings were you at before? And what are the
> details of the problem query?

shared_buffers       was 1024,  now 8192
max_fsm_relations    was 1000,  now 10000
max_fsm_pages        was 20000, now 100000
wal_buffers          was 8,     now 16
sort_mem             was 1024,  now 64000
vacuum_mem           was 1024,  now 64000
effective_cache_size was 1000,  now 100000

I am in the process of reloading the dB, but obs_v and obs_i contain
~750000 records each. I'd be happy to play around with the settings if
you would like to see the timing results. I'll also be able to get some
explain analyze results tomorrow when finished reloading. Suggestions
as to what values to change first?

There is a 'C' language trigger on the obs_v and obs_i tables which
essentially combines the data from the obs_? tables and updates the
catalog table when the obs_? records are updated.
The query is:

UPDATE obs_v
SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
    use = true
FROM color_group AS cg, zero_pair AS zp, obs_i AS i, files AS f
WHERE obs_v.star_id = i.star_id
  AND obs_v.file_id = f.file_id
  AND cg.group_id = f.group_id
  AND f.group_id = $group_id
  AND zp.pair_id = f.pair_id

which is called from a perl script (DBD::Pg - which sets $group_id),
and the relevant tables are:

                   Table "public.obs_v"
 Column  |  Type   |                   Modifiers
---------+---------+------------------------------------------------
 x       | real    | not null
 y       | real    | not null
 imag    | real    | not null
 smag    | real    | not null
 ra      | real    | not null
 dec     | real    | not null
 obs_id  | integer | not null default nextval('"obs_id_seq"'::text)
 file_id | integer |
 use     | boolean | default false
 solve   | boolean | default false
 star_id | integer |
 mag     | real    |
Indexes: obs_v_file_id_index btree (file_id),
         obs_v_loc_index btree (ra, "dec"),
         obs_v_obs_id_index btree (obs_id),
         obs_v_star_id_index btree (star_id),
         obs_v_use_index btree (use)
Foreign Key constraints: obs_v_files_constraint FOREIGN KEY (file_id)
         REFERENCES files(file_id) ON UPDATE NO ACTION ON DELETE CASCADE
Triggers: obs_v_trig

with obs_i being identical (inherited from same root table)

      Table "public.color_group"
  Column  |  Type   | Modifiers
----------+---------+-----------
 group_id | integer |
 color_u  | real    | default 0
 color_b  | real    | default 0
 color_v  | real    | default 0
 color_r  | real    | default 0
 color_i  | real    | default 0
Indexes: color_group_group_id_index btree (group_id)
Foreign Key constraints: $1 FOREIGN KEY (group_id) REFERENCES
         groups(group_id) ON UPDATE NO ACTION ON DELETE CASCADE

       Table "public.zero_pair"
 Column  |  Type   | Modifiers
---------+---------+-----------
 pair_id | integer | not null
 zero_u  | real    | default 0
 zero_b  | real    | default 0
 zero_v  | real    | default 0
 zero_r  | real    | default 0
 zero_i  | real    | default 0
Indexes: zero_pair_pkey primary key btree (pair_id),
         zero_pair_pair_id_index btree (pair_id)
Foreign Key constraints: $1 FOREIGN KEY (pair_id) REFERENCES
         pairs(pair_id) ON UPDATE NO ACTION ON DELETE CASCADE

                              Table "public.files"
  Column  |           Type           |                       Modifiers
----------+--------------------------+-------------------------------------------------------
 file_id  | integer                  | not null default nextval('"files_file_id_seq"'::text)
 group_id | integer                  |
 pair_id  | integer                  |
 date     | timestamp with time zone | not null
 name     | character varying        | not null
 ra_min   | real                     | default 0
 ra_max   | real                     | default 0
 dec_min  | real                     | default 0
 dec_max  | real                     | default 0
Indexes: files_pkey primary key btree (file_id),
         files_name_key unique btree (name),
         files_id_index btree (file_id, group_id, pair_id),
         files_range_index btree (ra_min, ra_max, dec_min, dec_max),
         imported__file_id_idex btree (file_id)
Foreign Key constraints: $1 FOREIGN KEY (group_id) REFERENCES
         groups(group_id) ON UPDATE NO ACTION ON DELETE CASCADE,
         $2 FOREIGN KEY (pair_id) REFERENCES pairs(pair_id)
         ON UPDATE NO ACTION ON DELETE CASCADE

                        Table "public.catalog"
      Column      |       Type       |                    Modifiers
------------------+------------------+-------------------------------------------------
 star_id          | integer          | not null default nextval('"star_id_seq"'::text)
 loc_count        | integer          | default 0
 ra               | real             | not null
 ra_sum           | double precision | default 0
 ra_sigma         | real             | default 0
 ra_sum_square    | double precision | default 0
 dec              | real             | not null
 dec_sum          | double precision | default 0
 dec_sigma        | real             | default 0
 dec_sum_square   | double precision | default 0
 mag_u_count      | integer          | default 0
 mag_u            | real             | default 99
 mag_u_sum        | double precision | default 0
 mag_u_sigma      | real             | default 0
 mag_u_sum_square | double precision | default 0
 mag_b_count      | integer          | default 0
 mag_b            | real             | default 99
 mag_b_sum        | double precision | default 0
 mag_b_sigma      | real             | default 0
 mag_b_sum_square | double precision | default 0
 mag_v_count      | integer          | default 0
 mag_v            | real             | default 99
 mag_v_sum        | double precision | default 0
 mag_v_sigma      | real             | default 0
 mag_v_sum_square | double precision | default 0
 mag_r_count      | integer          | default 0
 mag_r            | real             | default 99
 mag_r_sum        | double precision | default 0
 mag_r_sigma      | real             | default 0
 mag_r_sum_square | double precision | default 0
 mag_i_count      | integer          | default 0
 mag_i            | real             | default 99
 mag_i_sum        | double precision | default 0
 mag_i_sigma      | real             | default 0
 mag_i_sum_square | double precision | default 0
Indexes: catalog_pkey primary key btree (star_id),
         catalog_ra_decl_index btree (ra, "dec"),
         catalog_star_id_index btree (star_id)

-- O_
Robert Creager <Robert_Creager@LogicalChaos.org> writes:
>> I'm interested to narrow down exactly what was the issue here.

> shared_buffers       was 1024,  now 8192
> max_fsm_relations    was 1000,  now 10000
> max_fsm_pages        was 20000, now 100000
> wal_buffers          was 8,     now 16
> sort_mem             was 1024,  now 64000
> vacuum_mem           was 1024,  now 64000
> effective_cache_size was 1000,  now 100000

> The query is:
> UPDATE obs_v
> SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
>     use = true
> FROM color_group AS cg, zero_pair AS zp, obs_i AS i, files AS f
> WHERE obs_v.star_id = i.star_id
>   AND obs_v.file_id = f.file_id
>   AND cg.group_id = f.group_id
>   AND f.group_id = $group_id
>   AND zp.pair_id = f.pair_id

Hm. My best guess is that the increase in sort_mem allowed this query
to use a more efficient join plan. Perhaps the planner switched from
merge to hash join once it thought the hash table would fit in
sort_mem; or maybe the plan didn't change but the executor was able to
keep everything in memory instead of using temp files.

The other changes you mention seem good as general housekeeping, but I
doubt they'd have much direct effect on this query's speed.

It'd be interesting to look at EXPLAIN ANALYZE results for the same
query at several different sort_mem values.

			regards, tom lane
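[Tom's experiment can be run directly from a psql session, since
sort_mem is settable per-session. A sketch: the group_id value is a
placeholder for whatever the perl script binds, and on 8.0 and later
the parameter is named work_mem rather than sort_mem.]

```sql
-- Compare plans for the same query at two sort memory settings
-- (values are in KB on the 7.x series).
SET sort_mem = 1024;    -- the old default
EXPLAIN ANALYZE
UPDATE obs_v
SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
    use = true
FROM color_group AS cg, zero_pair AS zp, obs_i AS i, files AS f
WHERE obs_v.star_id = i.star_id
  AND obs_v.file_id = f.file_id
  AND cg.group_id = f.group_id
  AND f.group_id = 1    -- placeholder for the real $group_id
  AND zp.pair_id = f.pair_id;

SET sort_mem = 64000;   -- the new value; now rerun the same
                        -- EXPLAIN ANALYZE and compare
```

If Tom's guess is right, the second run should show Hash Join nodes
where the first showed Merge Join (or the same plan with the sort/hash
steps no longer spilling to temp files).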
> [szepe@pinerecords.com]
>
> > Also note that no amount of vacuuming will save you if the FSM is
> > not large enough to keep track of all the free space. The default
> > FSM settings, like all the other default settings in Postgres, are
> > set up for a small installation. You'd probably need to raise them
> > by at least a factor of 10 for this installation.
>
> Thanks, I'll try to tweak those settings and will let the list know
> how things went.

Well, raising max_fsm_pages to 500000 seems to have solved the problem
entirely. My thanks go to everyone who offered their help.

--
Tomas Szepe <szepe@pinerecords.com>