Thread: database bloat, non-removable rows, slow queries, etc.
Hi, this is probably a very frequently asked question... I read the archives of this list but didn't find a definitive solution for this issue. I'll explain my situation.

PostgreSQL version 8.1.3, fsm (and related) settings at their defaults, autovacuum and statistics collection enabled, and 22 daemons that hold persistent connections to this database (all connections are in "idle", no transaction opened).

This is the vacuum output of a table that is updated frequently:

database=# VACUUM ANALYZE verbose cliente;
INFO:  vacuuming "public.cliente"
INFO:  index "cliente_pkey" now contains 29931 row versions in 88 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "cliente_login_key" now contains 29931 row versions in 165 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.00 sec.
INFO:  "cliente": found 0 removable, 29931 nonremovable row versions in 559 pages
DETAIL:  29398 dead row versions cannot be removed yet.
There were 9 unused item pointers.
0 pages are entirely empty.
CPU 0.01s/0.01u sec elapsed 0.01 sec.
INFO:  vacuuming "pg_toast.pg_toast_370357"
INFO:  index "pg_toast_370357_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_370357": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.cliente"
INFO:  "cliente": scanned 559 of 559 pages, containing 533 live rows and 29398 dead rows; 533 rows in sample, 533 estimated total rows
VACUUM

database=# SELECT * FROM pgstattuple('cliente');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
   4579328 |         533 |     84522 |          1.85 |            29398 |        4279592 |              93.45 |      41852 |         0.91
(1 row)

The performance of this table is now degraded, and neither autovacuum nor VACUUM FULL removes these dead tuples. Only a CLUSTER of the table gets rid of them. The same problem exists on other high-traffic tables.

I think the problem is probably one of the following:

- my fsm (and related) settings in postgresql.conf need tuning, but I don't understand how to tune them correctly;
- the persistent connections to this database conflict with autovacuum, but I don't understand why: there are no open transactions, only connections in the "idle" state.

Tell me what you think...

Regards,

Matteo
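A quick cross-check of how the table looks to the planner (as opposed to pgstattuple's physical scan) is the pg_class entry; a rough sketch, keeping in mind that relpages and reltuples are only as fresh as the last VACUUM or ANALYZE:

    -- table size on disk vs. the row count the planner believes in
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'cliente';

559 pages for 533 live rows tells the same story as the pgstattuple output above: almost every page is dead space.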
Matteo Sgalaberni <sgala@sgala.com> writes:
> 22 daemons that have a persistent connection to this database (all
> connections are in "idle", no transaction opened).

You may think that, but you are wrong.

> INFO:  "cliente": found 0 removable, 29931 nonremovable row versions in 559 pages
> DETAIL:  29398 dead row versions cannot be removed yet.

The only way the above can happen is if there are some fairly old open
transactions.  Looking in pg_stat_activity might help you identify the
culprit(s).

			regards, tom lane
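A minimal sketch of the sort of pg_stat_activity query Tom is suggesting, using the 8.1-era column names (procpid, current_query; later releases renamed these). Note that current_query only shows the '<IDLE> in transaction' marker if stats_command_string is enabled:

    -- sessions sitting inside an open transaction, oldest first
    SELECT procpid, datname, usename, query_start, current_query
    FROM pg_stat_activity
    WHERE current_query = '<IDLE> in transaction'
    ORDER BY query_start;

An entry with an old query_start here is the usual suspect for holding back VACUUM.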
Are there open transactions on the table in question?  We had the same
issue.  A 100K row table was so bloated that the system thought there were
1M rows.  We had many <IDLE> transactions that we noticed in top, but since
we could not track down which process or user was holding the table, we had
to restart Pg.  Once restarted we were able to do a VACUUM FULL and this
took care of the issue.

hth

Patrick Hatcher
Development Manager  Analytics/MIO
Macys.com
Hi, Tom and Matteo,

Tom Lane wrote:
> Matteo Sgalaberni <sgala@sgala.com> writes:
>> 22 daemons that have a persistent connection to this database (all
>> connections are in "idle", no transaction opened).
>
> You may think that, but you are wrong.
>
>> INFO:  "cliente": found 0 removable, 29931 nonremovable row versions in 559 pages
>> DETAIL:  29398 dead row versions cannot be removed yet.
>
> The only way the above can happen is if there are some fairly old open
> transactions.  Looking in pg_stat_activity might help you identify the
> culprit(s).

Another possibility might be an outstanding two-phase-commit transaction.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
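If two-phase commit is in use, the pg_prepared_xacts view (available since 8.1) lists transactions that were PREPAREd but never committed or rolled back; a sketch along these lines should come back empty on a healthy system:

    -- prepared transactions that are still pending
    SELECT gid, database, owner, prepared
    FROM pg_prepared_xacts
    ORDER BY prepared;

A row sitting here for days holds back VACUUM just like an open session does; it has to be resolved with COMMIT PREPARED 'gid' or ROLLBACK PREPARED 'gid'.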
On Fri, Sep 01, 2006 at 10:43:30AM -0400, Tom Lane wrote:
> Matteo Sgalaberni <sgala@sgala.com> writes:
> > 22 daemons that have a persistent connection to this database (all
> > connections are in "idle", no transaction opened).
>
> You may think that, but you are wrong.

Ok. I stopped all clients. No connections to this database, only the psql
console. I ran VACUUM FULL/FREEZE in all combinations... again the dead
rows are not removable. Nothing changed compared to production.

This is my postgres config:

http://pastebin.com/781480

I read a lot about bloated tables related to inappropriate fsm settings...
could mine be a case of misconfiguration of those parameters?

Thx

Matteo
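On the FSM question: in 8.1 the simplest sanity check is a database-wide VACUUM VERBOSE run as superuser, whose trailing lines report how many free-space-map page slots are actually needed versus the configured limits (the figures below are purely illustrative):

    VACUUM VERBOSE;
    -- ...
    -- INFO:  free space map contains 1234 pages in 74 relations
    -- DETAIL:  A total of 5000 page slots are in use (including overhead).
    --          250000 page slots are required to track all free space.
    --          Current limits are:  20000 page slots, 1000 relations, ...

If the "required" figure exceeds the current limit, max_fsm_pages (and possibly max_fsm_relations) should be raised in postgresql.conf, which takes a postmaster restart; example values only:

    max_fsm_pages = 300000
    max_fsm_relations = 2000

That said, FSM sizing only governs how well already-reclaimed space gets reused; it cannot explain rows that VACUUM reports as nonremovable in the first place.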
Matteo Sgalaberni <sgala@sgala.com> writes:
> Ok. I stopped all clients. No connections to this database.

When you say "this database", do you mean the whole postmaster cluster,
or just the one database?  Open transactions in other databases of the
same cluster can be a problem.

			regards, tom lane
On Fri, 2006-09-01 at 12:28, Matteo Sgalaberni wrote:
> On Fri, Sep 01, 2006 at 10:43:30AM -0400, Tom Lane wrote:
> > Matteo Sgalaberni <sgala@sgala.com> writes:
> > > 22 daemons that have a persistent connection to this database (all
> > > connections are in "idle", no transaction opened).
> >
> > You may think that, but you are wrong.
>
> Ok. I stopped all clients. No connections to this database, only the psql
> console. I ran VACUUM FULL/FREEZE in all combinations... again the dead
> rows are not removable. Nothing changed compared to production.
>
> This is my postgres config:
>
> http://pastebin.com/781480
>
> I read a lot about bloated tables related to inappropriate fsm settings...
> could mine be a case of misconfiguration of those parameters?

Something is holding a lock, somewhere.  Have you tried shutting down and
restarting the database to see if you can get it to vacuum that way?

You're not in a transaction in psql, right?
On Fri, Sep 01, 2006 at 01:35:20PM -0400, Tom Lane wrote:
> Matteo Sgalaberni <sgala@sgala.com> writes:
> > Ok. I stopped all clients. No connections to this database.
>
> When you say "this database", do you mean the whole postmaster cluster,
> or just the one database?  Open transactions in other databases of the
> same cluster can be a problem.

AGH!!!! AGHR!!!

A JDBC application of a colleague of mine stays in "idle in transaction"
24h a day (but in another database, not in the database where I reported
the bloat...)!

I killed it now (the JDBC app).

I ran VACUUM FULL and PG cleaned up everything!! So if I have an idle
transaction in one database of the cluster, it "blocks" vacuums in all
databases of the cluster.

Good to know this... but why this behaviour? It is lovely... :)

Tom, can you explain why?...

Thanks a lot!!

Matteo
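For reference, a sketch of how the offending session could have been spotted and dealt with from SQL (8.1-era pg_stat_activity columns; the pid is whatever the query returns):

    -- look across *all* databases in the cluster, not just the bloated one
    SELECT procpid, datname, usename, query_start
    FROM pg_stat_activity
    WHERE current_query = '<IDLE> in transaction'
    ORDER BY query_start;

8.1 has pg_cancel_backend(pid) to cancel a running query, but nothing to disconnect an idle session from SQL (pg_terminate_backend() only appeared in later releases), so the session generally has to be closed by stopping the client, or by sending its backend a SIGTERM from the shell (kill <procpid>), which was common practice though not officially encouraged on 8.1.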
Matteo,

On 2-Sep-06, at 4:37 AM, Matteo Sgalaberni wrote:
> A JDBC application of a colleague of mine stays in "idle in transaction"
> 24h a day (but in another database, not in the database where I reported
> the bloat...)!
>
> I killed it now (the JDBC app).

This behaviour has been fixed in later versions of the JDBC driver.

> I ran VACUUM FULL and PG cleaned up everything!! So if I have an idle
> transaction in one database of the cluster, it "blocks" vacuums in all
> databases of the cluster.
>
> Good to know this... but why this behaviour? It is lovely... :)
>
> Tom, can you explain why?...
>
> Thanks a lot!!
>
> Matteo
Matteo Sgalaberni <sgala@sgala.com> writes:
> Good to know this... but why this behaviour? It is lovely... :)

Open transactions are tracked across the whole cluster.  This is
necessary when vacuuming shared catalogs.  In principle we could track
per-database xmin values as well, but the distributed overhead that'd be
added to *every* GetSnapshotData call is a bit worrisome.

			regards, tom lane
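Tom's point is easy to reproduce with two psql sessions connected to different databases in the same cluster; a small sketch, with the table name taken from this thread and the database names purely illustrative:

    -- session 1: psql some_other_db -- open a transaction, take a snapshot
    -- (the SELECT matters; a bare BEGIN does not), then just sit there
    BEGIN;
    SELECT 1;

    -- session 2: psql database -- create some dead row versions and vacuum
    UPDATE cliente SET login = login;
    VACUUM VERBOSE cliente;
    -- reports "N dead row versions cannot be removed yet" until session 1
    -- commits or rolls back, even though it never touched this database.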
Hi, Matteo,

Matteo Sgalaberni wrote:
> A JDBC application of a colleague of mine stays in "idle in transaction"
> 24h a day

Just a little note: for most applications, this can be fixed by updating
the JDBC driver. Old versions had the behaviour of auto-opening a new
backend transaction on commit/rollback, whereas new versions delay that
until the first statement of the new transaction is sent.

This won't fix applications that do a select and then sit idle for days
before committing or rolling back, however. Those should be fixed, or use
autocommit mode.

> Good to know this... but why this behaviour? It is lovely... :)
>
> Tom, can you explain why?...

It is because the transaction IDs are global per cluster.

Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Matteo Sgalaberni <sgala@sgala.com> writes:
> > Good to know this... but why this behaviour? It is lovely... :)
>
> Open transactions are tracked across the whole cluster.  This is
> necessary when vacuuming shared catalogs.  In principle we could track
> per-database xmin values as well, but the distributed overhead that'd be
> added to *every* GetSnapshotData call is a bit worrisome.

Don't we do that now in CVS (ie, in 8.2)?

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Gregory Stark wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > Matteo Sgalaberni <sgala@sgala.com> writes:
> > > Good to know this... but why this behaviour? It is lovely... :)
> >
> > Open transactions are tracked across the whole cluster.  This is
> > necessary when vacuuming shared catalogs.  In principle we could track
> > per-database xmin values as well, but the distributed overhead that'd
> > be added to *every* GetSnapshotData call is a bit worrisome.
>
> Don't we do that now in CVS (ie, in 8.2)?

No, we don't.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Gregory Stark wrote:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >
> > > Matteo Sgalaberni <sgala@sgala.com> writes:
> > > > Good to know this... but why this behaviour? It is lovely... :)
> > >
> > > Open transactions are tracked across the whole cluster.  This is
> > > necessary when vacuuming shared catalogs.  In principle we could
> > > track per-database xmin values as well, but the distributed overhead
> > > that'd be added to *every* GetSnapshotData call is a bit worrisome.
> >
> > Don't we do that now in CVS (ie, in 8.2)?
>
> No, we don't.

I must be misunderstanding Tom's comment then.

What I'm referring to is that lazy_vacuum_rel() calls vacuum_set_xid_limits
with the relisshared flag of the relation; vacuum_set_xid_limits passes
that to GetOldestXmin as the allDbs parameter, and GetOldestXmin ignores
transactions not connected to the same database unless allDbs is true.

--
greg
Gregory Stark <gsstark@mit.edu> writes:
> I must be misunderstanding Tom's comment then.
> What I'm referring to is that lazy_vacuum_rel() calls vacuum_set_xid_limits
> with the relisshared flag of the relation; vacuum_set_xid_limits passes
> that to GetOldestXmin as the allDbs parameter, and GetOldestXmin ignores
> transactions not connected to the same database unless allDbs is true.

The problem is the indirect effect of other backends' xmin values, which
are computed across all live backends.  In the current structure, it's
hard to see how to fix this except by making each backend compute and
advertise both a global and a database-local xmin.  This seems a bit ugly.
Also, someone asked recently whether we could avoid counting prepared
xacts when figuring vacuum cutoffs, which seems a fair question --- but
again, how to do that without doubling the number of advertised xmin
values yet again?

I'm starting to feel that we've reached the limits of this system of
accounting for live XIDs, but I have no idea what the next step might
look like...

			regards, tom lane