Thread: table size growing out of control
I have a table of about 5000 records that I noticed was taking a very long time to do simple selects from. I looked at explain analyze and got the following: rms=# explain analyze select count(*) from health_exception_test; NOTICE: QUERY PLAN: Aggregate (cost=158497.22..158497.22 rows=1 width=0) (actual time=78087.35..78087.35 rows=1 loops=1) -> Seq Scan on health_exception_test (cost=0.00..158483.58 rows=5458 width=0) (actual time=78059.74..78082.31 rows=5458 loops=1) Total runtime: 78087.44 msec EXPLAIN rms=# explain analyze select count(*) from health_exception_test; NOTICE: QUERY PLAN: Aggregate (cost=158497.22..158497.22 rows=1 width=0) (actual time=80363.50..80363.50 rows=1 loops=1) -> Seq Scan on health_exception_test (cost=0.00..158483.58 rows=5458 width=0) (actual time=80335.86..80358.48 rows=5458 loops=1) Total runtime: 80363.59 msec EXPLAIN Since we do nightly vacuuming, I thought that there might be some index issues so I did the following query to get the size of the table: rms=# SELECT relname, relkind, relpages, relpages / 128 AS MB FROM pg_class WHERE relname LIKE 'health_%'; relname | relkind | relpages | mb -----------------------+---------+----------+------ health_exception_test | r | 158429 | 1237 health_ex_group | i | 20 | 0 (2 rows) health_ex_group is an index on 3 fields in the table. I have done a reindex on the table but that doesn't have much effect (which makes sense given the small index size). I also did a vacuum verbose analyze and got the following: rms=# VACUUM VERBOSE ANALYZE health_exception_test; NOTICE: --Relation health_exception_test-- NOTICE: Index health_ex_group: Pages 20; Tuples 5458: Deleted 0. CPU 0.00s/0.01u sec elapsed 0.00 sec. NOTICE: Removed 1397914 tuples in 14402 pages. CPU 1.67s/2.04u sec elapsed 22.90 sec. NOTICE: Index health_ex_group: Pages 20; Tuples 5458: Deleted 0. CPU 0.00s/0.00u sec elapsed 0.00 sec. NOTICE: Removed 271549 tuples in 2810 pages. CPU 0.32s/0.37u sec elapsed 2.89 sec. NOTICE: Pages 158429: Changed 0, Empty 0; Tup 5458: Vac 1669463, Keep 0, UnUsed 13717916. Total CPU 11.68s/3.44u sec elapsed 116.67 sec. NOTICE: --Relation pg_toast_9370044-- NOTICE: Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0. Total CPU 0.00s/0.00u sec elapsed 0.00 sec. NOTICE: Analyzing health_exception_test VACUUM and after checking the size of the table was no different. At this point we did a full drop/reload of the database and the table has now shrunk to the following: rms=# SELECT relname, relkind, relpages, relpages / 128 AS MB FROM pg_class WHERE relname LIKE 'health_%'; relname | relkind | relpages | mb -----------------------+---------+----------+---- health_ex_group | i | 20 | 0 health_exception_test | r | 57 | 0 (2 rows) and we now get a much more pleasing: rms=# explain analyze select count(*) from health_exception_test; NOTICE: QUERY PLAN: Aggregate (cost=125.22..125.22 rows=1 width=0) (actual time=13.15..13.15 rows=1 loops=1) -> Seq Scan on health_exception_test (cost=0.00..111.58 rows=5458 width=0) (actual time=0.01..8.18 rows=5458 loops=1) Total runtime: 13.21 msec EXPLAIN For the record, we went through this procedure about 2 weeks ago (slow queries, reindex, vacuum, drop/reload) So I am wondering what might be causing the table to grow so large. We run a function against the table about every 5 minutes which updates on average maybe 100 rows and adds rows at the rate of maybe 1 an hour, but otherwise everything else is selects. I wouldn't think that continual updates would have such a adverse effect on table size, and even if so shouldn't vacuum take care of this? Robert Treat
On 15 Jul 2002, Robert Treat wrote: > I have a table of about 5000 records that I noticed was taking a very > long time to do simple selects from. I looked at explain analyze and got Did you check the actual physical file sizes of your indexes and tables on the disk? One might have been growing but not shoing it in the tables you looked at. Did you have stalled transactions holding a lock on anything? That's all I can think of.
On Mon, Jul 15, 2002 at 04:20:10PM -0400, Robert Treat wrote: > I have a table of about 5000 records that I noticed was taking a very > long time to do simple selects from. I looked at explain analyze and got > the following: > > NOTICE: Pages 158429: Changed 0, Empty 0; Tup 5458: Vac 1669463, Keep > 0, UnUsed 13717916. That's a lot of unused tuples. Perhaps you need to do a VACUUM FULL. See the docs. Though I guess it's a little bit late to test now. Old uncommitted transactions could also be the culprit. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > There are 10 kinds of people in the world, those that can do binary > arithmetic and those that can't.
Robert Treat <rtreat@webmd.net> writes: > For the record, we went through this procedure about 2 weeks ago (slow > queries, reindex, vacuum, drop/reload) So I am wondering what might be > causing the table to grow so large. We run a function against the table > about every 5 minutes which updates on average maybe 100 rows and adds > rows at the rate of maybe 1 an hour, but otherwise everything else is > selects. I wouldn't think that continual updates would have such a > adverse effect on table size, and even if so shouldn't vacuum take care > of this? You can do VACUUM FULL if you want to re-shrink the table. If you want to stick with plain VACUUMs then you need to do them often enough to keep the table size reasonable. You didn't say what your maintenance schedule is... If your overall database is large then you might need to increase the size of the free space map (see postgresql.conf). regards, tom lane
> If your overall database is large then you might need to increase the > size of the free space map (see postgresql.conf). I haven't seen any good documentation on what exactly this does, how the settings affect performance, or recommended values. Does this information exist anywhere? How would one go about optimizing these values? Greg
"Gregory Wood" <gregw@com-stock.com> writes: >> If your overall database is large then you might need to increase the >> size of the free space map (see postgresql.conf). > I haven't seen any good documentation on what exactly this does, how the > settings affect performance, or recommended values. > Does this information exist anywhere? Nope. Feel free to run some experiments and create some recommendations ;-) regards, tom lane
On Tue, 2002-07-16 at 01:38, Tom Lane wrote: > Robert Treat <rtreat@webmd.net> writes: > > For the record, we went through this procedure about 2 weeks ago (slow > > queries, reindex, vacuum, drop/reload) So I am wondering what might be > > causing the table to grow so large. We run a function against the table > > about every 5 minutes which updates on average maybe 100 rows and adds > > rows at the rate of maybe 1 an hour, but otherwise everything else is > > selects. I wouldn't think that continual updates would have such a > > adverse effect on table size, and even if so shouldn't vacuum take care > > of this? > > You can do VACUUM FULL if you want to re-shrink the table. If you want > to stick with plain VACUUMs then you need to do them often enough to > keep the table size reasonable. You didn't say what your maintenance > schedule is... Currently we do a nightly vacuum analyze on the entire database, and once a week we reindex the table. I suppose that I could increase the frequency of those vacuums but vacuum itself doesn't seem to be enough anyway. One thing I picked out from the archives is that vacuum cannot recover disk space if it cannot obtain an exclusive lock on the table. If this is still the case (someone confirm this and I'll add a note to the docs) it might explain part of my problem since that table is almost continually being updated. I gathered some more statistics that might be of interest: sizes from pg_class after the drop/reload of db: relname | relkind | relpages | mb -----------------------+---------+----------+---- health_ex_group | i | 20 | 0 health_exception_test | r | 57 | 0 sizes this morning after about 15 hours of use: relname | relkind | relpages | mb -----------------------+---------+----------+----- health_ex_group | i | 6975 | 54 health_exception_test | r | 17053 | 133 as you can see, things have already started to grow. I decided to run a reindex on the table, and now it shows: relname | relkind | relpages | mb -----------------------+---------+----------+----- health_ex_group | i | 21 | 0 health_exception_test | r | 24839 | 194 which gives me a significant reduction in my index size, but seems to have actually increased the table size by a large margin as well. Is this to be considered the norm? i then ran vacuum analyze on the table which gives me sizes of: relname | relkind | relpages | mb -----------------------+---------+----------+----- health_ex_group | i | 686 | 5 health_exception_test | r | 26331 | 205 still no real benefits. At this point I decided to run vacuum full and got the following: rms=# vacuum full analyze verbose health_exception_test; NOTICE: --Relation health_exception_test-- NOTICE: Pages 26331: Changed 176, reaped 26274, Empty 0, New 0; Tup 5593: Vac 570052, Keep/VTL 0/0, UnUsed 1982957, MinLen 54, MaxLen 78; Re-using: Free/Avail. Space 204496076/203607088; EndEmpty/Avail. Pages 114/26161. CPU 1.25s/0.19u sec elapsed 1.43 sec. NOTICE: Index health_ex_group: Pages 2511; Tuples 5593: Deleted 570052. CPU 0.33s/2.70u sec elapsed 11.08 sec. NOTICE: Rel health_exception_test: Pages: 26331 --> 58; Tuple(s) moved: 5593. CPU 3.39s/3.38u sec elapsed 37.76 sec. NOTICE: Index health_ex_group: Pages 2519; Tuples 5593: Deleted 5593. CPU 0.15s/0.04u sec elapsed 1.40 sec. NOTICE: --Relation pg_toast_11914691-- NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec elapsed 0.00 sec. NOTICE: Index pg_toast_11914691_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec elapsed 0.00 sec. NOTICE: Analyzing health_exception_test VACUUM this is what I want to see, all of the unused tuples being reclaimed. checking my sizes again: relname | relkind | relpages | mb -----------------------+---------+----------+---- health_ex_group | i | 2519 | 19 health_exception_test | r | 58 | 0 ok, the table seems back in check now, but we still have a slight issue on the index, but that gets solved by a quick run of reindex: relname | relkind | relpages | mb -----------------------+---------+----------+---- health_ex_group | i | 21 | 0 health_exception_test | r | 58 | 0 and now I am back where I belong. I guess my next step now becomes making a vacuum full & reindex of that table part of my everyday maintenance. I can do it, but have to admit it seems excessive imho. The other thing I guess might be to rethink how we are doing updates on that table, to see if we can optimize it more. > > If your overall database is large then you might need to increase the > size of the free space map (see postgresql.conf). > Well, it is large and I do think we need to increase the fsm, is there any documentation as to the effects of changing it? Robert Treat
On Tue, Jul 16, 2002 at 03:45:08PM -0400, Robert Treat wrote: > anyway. One thing I picked out from the archives is that vacuum cannot > recover disk space if it cannot obtain an exclusive lock on the table. > If this is still the case (someone confirm this and I'll add a note to > the docs) it might explain part of my problem since that table is almost > continually being updated. I gathered some more statistics that might be > of interest: But as I understand it, the standard, non-blocking vacuum marks unused pages for reuse by the backend. That approach can only "remember" so many recovered pages. Adjusting the free space map setting improves that, so if you have a lot of turnover in your tables, you can increase the FSM and vacuum more frequently. You still need up to double the size of the table, however, to accommodate the turnover. Now, I _think_ the above is correct, and I hope someone will correct me if I'm wrong. One question I have, however, is what the performance penalty is of having a more-or-less constant vacuum process running. Historically, of course, one had to trade off vacuuming against the cost of an exclusive table lock. But now that vacuum doesn't block everyone else, is there some reason not to run vacuum (say) hourly (aside, obviously, from load on the machine). A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Andrew Sullivan <andrew@libertyrms.info> writes: > But as I understand it, the standard, non-blocking vacuum marks > unused pages for reuse by the backend. That approach can only > "remember" so many recovered pages. Adjusting the free space map > setting improves that, so if you have a lot of turnover in your > tables, you can increase the FSM and vacuum more frequently. You > still need up to double the size of the table, however, to > accommodate the turnover. Only if your vacuum schedule is to vacuum once per 100% turnover of the table contents. If you vacuum as often as, say, 10% of the table rows are updated or deleted, then you should see the table size remaining at about 10% over the minimum possible size. So it's a straight tradeoff of CPU expenditure versus disk space. Tables that get a lot of update activity need to be vacuumed often to keep them from bloating. Or at least that's the theory. It can fall down if your FSM size is too small to let all the free space be tracked. We've also seen some reports since 7.2 release of tables growing when it didn't appear that they should, but I'm unconvinced yet whether those cases were PG bugs or application problems (eg, old open transactions preventing VACUUM from reclaiming space). regards, tom lane
On Tue, 2002-07-16 at 19:18, Tom Lane wrote: > Andrew Sullivan <andrew@libertyrms.info> writes: > > But as I understand it, the standard, non-blocking vacuum marks > > unused pages for reuse by the backend. That approach can only > > "remember" so many recovered pages. Adjusting the free space map > > setting improves that, so if you have a lot of turnover in your > > tables, you can increase the FSM and vacuum more frequently. You > > still need up to double the size of the table, however, to > > accommodate the turnover. > > Only if your vacuum schedule is to vacuum once per 100% turnover of > the table contents. If you vacuum as often as, say, 10% of the table > rows are updated or deleted, then you should see the table size > remaining at about 10% over the minimum possible size. So it's a > straight tradeoff of CPU expenditure versus disk space. Tables that > get a lot of update activity need to be vacuumed often to keep them > from bloating. > > Or at least that's the theory. It can fall down if your FSM size > is too small to let all the free space be tracked. We've also seen > some reports since 7.2 release of tables growing when it didn't appear > that they should, but I'm unconvinced yet whether those cases were > PG bugs or application problems (eg, old open transactions preventing > VACUUM from reclaiming space). > Is there a way to verify the existence of the old open transactions? Robert Treat
On Tue, Jul 16, 2002 at 07:37:56PM -0400, Robert Treat wrote: > Is there a way to verify the existence of the old open transactions? select * from pg_stat_activity; or ps aux | grep postgres Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
On Tue, 2002-07-16 at 19:18, Tom Lane wrote: > Andrew Sullivan <andrew@libertyrms.info> writes: > > But as I understand it, the standard, non-blocking vacuum marks > > unused pages for reuse by the backend. That approach can only > > "remember" so many recovered pages. Adjusting the free space map > > setting improves that, so if you have a lot of turnover in your > > tables, you can increase the FSM and vacuum more frequently. You > > still need up to double the size of the table, however, to > > accommodate the turnover. > > Only if your vacuum schedule is to vacuum once per 100% turnover of > the table contents. If you vacuum as often as, say, 10% of the table > rows are updated or deleted, then you should see the table size > remaining at about 10% over the minimum possible size. So it's a > straight tradeoff of CPU expenditure versus disk space. Tables that > get a lot of update activity need to be vacuumed often to keep them > from bloating. > > Or at least that's the theory. It can fall down if your FSM size > is too small to let all the free space be tracked. We've also seen > some reports since 7.2 release of tables growing when it didn't appear > that they should, but I'm unconvinced yet whether those cases were > PG bugs or application problems (eg, old open transactions preventing > VACUUM from reclaiming space). > > > Is there a way to verify the existence of the old open transactions? > > select * from pg_stat_activity; > > or > > ps aux | grep postgres those are the ways I had been thinking, here are some results: rms=# select * from pg_stat_activity; datid | datname | procpid | usesysid | usename | current_query ----------+---------+---------+----------+----------+--------------- 11914305 | rms | 2355 | 1 | postgres | 11914305 | rms | 29985 | 1 | postgres | 11914305 | rms | 4586 | 1 | postgres | 11914305 | rms | 1999 | 1 | postgres | 11914305 | rms | 5290 | 1 | postgres | 11914305 | rms | 21195 | 1 | postgres | 11914305 | rms | 5179 | 1 | postgres | 11914305 | rms | 2755 | 1 | postgres | 11914305 | rms | 4708 | 1 | postgres | 11914305 | rms | 2334 | 1 | postgres | 11914305 | rms | 21196 | 1 | postgres | 11914305 | rms | 2759 | 1 | postgres | 11914305 | rms | 2690 | 1 | postgres | 11914305 | rms | 21203 | 1 | postgres | 11914305 | rms | 26541 | 1 | postgres | 11914305 | rms | 21202 | 1 | postgres | 11914305 | rms | 5280 | 1 | postgres | 11914305 | rms | 21205 | 1 | postgres | 11914305 | rms | 5047 | 1 | postgres | 11914305 | rms | 4864 | 1 | postgres | 11914305 | rms | 5291 | 1 | postgres | 11914305 | rms | 5250 | 1 | postgres | (22 rows) ** with current_query blank I'm assuming we have nothing hanging around that would interfere right?. ** rms=# SELECT relname, relkind, relpages, relpages / 128 AS MB FROM pg_class WHERE relname LIKE 'health_%'; relname | relkind | relpages | mb -----------------------+---------+----------+----- health_ex_group | i | 7433 | 58 health_exception_test | r | 18165 | 141 rms=# vacuum analyze verbose health_exception_test; NOTICE: --Relation health_exception_test-- NOTICE: Index health_ex_group: Pages 9698; Tuples 5715: Deleted 639447. CPU 0.72s/2.97u sec elapsed 12.20 sec. NOTICE: Removed 639447 tuples in 6585 pages. CPU 0.91s/0.70u sec elapsed 8.04 sec. NOTICE: Pages 24749: Changed 178, Empty 0; Tup 5715: Vac 639447, Keep 0, UnUsed 1758908. Total CPU 2.79s/3.80u sec elapsed 21.53 sec. NOTICE: --Relation pg_toast_11914691-- NOTICE: Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0. Total CPU 0.00s/0.00u sec elapsed 0.00 sec. NOTICE: Analyzing health_exception_test VACUUM rms=# rms=# SELECT relname, relkind, relpages, relpages / 128 AS MB FROM pg_class WHERE relname LIKE 'health_%'; relname | relkind | relpages | mb -----------------------+---------+----------+----- health_ex_group | i | 9698 | 75 health_exception_test | r | 24749 | 193 rms=# select * from pg_stat_activity; datid | datname | procpid | usesysid | usename | current_query ----------+---------+---------+----------+----------+--------------- 11914305 | rms | 2355 | 1 | postgres | 11914305 | rms | 29985 | 1 | postgres | 11914305 | rms | 4586 | 1 | postgres | 11914305 | rms | 1999 | 1 | postgres | 11914305 | rms | 5290 | 1 | postgres | 11914305 | rms | 21195 | 1 | postgres | 11914305 | rms | 5179 | 1 | postgres | 11914305 | rms | 2755 | 1 | postgres | 11914305 | rms | 4708 | 1 | postgres | 11914305 | rms | 2334 | 1 | postgres | 11914305 | rms | 21196 | 1 | postgres | 11914305 | rms | 2759 | 1 | postgres | 11914305 | rms | 2690 | 1 | postgres | 11914305 | rms | 21203 | 1 | postgres | 11914305 | rms | 26541 | 1 | postgres | 11914305 | rms | 21202 | 1 | postgres | 11914305 | rms | 5280 | 1 | postgres | 11914305 | rms | 21205 | 1 | postgres | 11914305 | rms | 5047 | 1 | postgres | 11914305 | rms | 4864 | 1 | postgres | 11914305 | rms | 5291 | 1 | postgres | 11914305 | rms | 5250 | 1 | postgres | (22 rows) Correct me if I'm wrong, but this seems to indicate that my table is growing, vacuum is not reusing space, and there are no stale transactions lying around. Now for the record I am still using the default FSM settings, I guess I need to either up this *a lot* or start vacuuming every 5 minutes? Robert Treat
On Wed, Jul 17, 2002 at 09:45:12AM -0400, Robert Treat wrote: > ** with current_query blank I'm assuming we have nothing hanging around > that would interfere right?. ** Not necessarily. It depends on what your STATS_COMMAND_STRING is. By default it's turned off. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Robert Treat wrote: > > those are the ways I had been thinking, here are some results: > > rms=# select * from pg_stat_activity; > datid | datname | procpid | usesysid | usename | current_query > ----------+---------+---------+----------+----------+--------------- > 11914305 | rms | 2355 | 1 | postgres | > 11914305 | rms | 29985 | 1 | postgres | > > [...] > > ** with current_query blank I'm assuming we have nothing hanging around > that would interfere right?. ** Assuming that you are doing it as a database superuser, wrong. It means that your have "stats_command_string = false" or commented out in your postgresql.conf file. Otherwise it would report "<IDLE>" or "<IDLE> in transaction". Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Wed, Jul 17, 2002 at 09:45:12AM -0400, Robert Treat wrote: > ** with current_query blank I'm assuming we have nothing hanging around > that would interfere right?. ** No, that could also indicate that either you're not the super-user, or you haven't enabled stats_command_string in postgresql.conf That probably isn't the best UI... Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
On 16 Jul 2002 15:45:08 -0400, Robert Treat <rtreat@webmd.net> wrote: >> > We run a function against the table >> > about every 5 minutes which updates on average maybe 100 rows and adds >> > rows at the rate of maybe 1 an hour, Robert, are you sure about those 100 updated rows/5 minutes? >sizes from pg_class after the drop/reload of db: > relname | relkind | relpages | mb >-----------------------+---------+----------+---- > health_ex_group | i | 20 | 0 > health_exception_test | r | 57 | 0 I think I saw you mention that there are 5500 rows. So you have approx. 100 rows/page. >sizes this morning after about 15 hours of use: > > relname | relkind | relpages | mb >-----------------------+---------+----------+----- > health_ex_group | i | 6975 | 54 > health_exception_test | r | 17053 | 133 This reflects the numbers at the time of your vacuum. >as you can see, things have already started to grow. I decided to run a >reindex on the table, and now it shows: > > relname | relkind | relpages | mb >-----------------------+---------+----------+----- > health_ex_group | i | 21 | 0 > health_exception_test | r | 24839 | 194 > >which gives me a significant reduction in my index size, but seems to >have actually increased the table size by a large margin as well. Is >this to be considered the norm? It did not increase the table size, it did update pg_class with current numbers. >i then ran vacuum analyze on the table which gives me sizes of: > > relname | relkind | relpages | mb >-----------------------+---------+----------+----- > health_ex_group | i | 686 | 5 > health_exception_test | r | 26331 | 205 So in the time between reindex and vacuum your table has grown by 1500 pages or (estimated) 150000 tuples. That's 30 times the number of rows, or - in other words - at a rate of 20 rows/minute this growth would be expected in 100 days. Now I may be wrong, but ISTM there is a process (or more) running that does a *lot* of updates. Can you tell us something about the function that is supposed to update 100 rows every five minutes? Is anything else doing updates you were not aware of at first sight? Note for example, that UPDATE table1 SET col1=col1; doesn't look like changing anything, but it writes a new version of every row to the database. Servus Manfred
nconway@klamath.dyndns.org (Neil Conway) writes: > On Wed, Jul 17, 2002 at 09:45:12AM -0400, Robert Treat wrote: >> ** with current_query blank I'm assuming we have nothing hanging around >> that would interfere right?. ** > No, that could also indicate that either you're not the super-user, or you > haven't enabled stats_command_string in postgresql.conf > That probably isn't the best UI... Good point. What do people think of making the pg_stat_activity view do COALESCE(current_query, '<unknown>') so that there's a clear flag that you're not seeing anything? I'm not wedded to the particular string <unknown>, if someone has a better idea. regards, tom lane
Tom Lane wrote: > nconway@klamath.dyndns.org (Neil Conway) writes: > > On Wed, Jul 17, 2002 at 09:45:12AM -0400, Robert Treat wrote: > >> ** with current_query blank I'm assuming we have nothing hanging around > >> that would interfere right?. ** > > > No, that could also indicate that either you're not the super-user, or you > > haven't enabled stats_command_string in postgresql.conf > > > That probably isn't the best UI... > > Good point. What do people think of making the pg_stat_activity view do > COALESCE(current_query, '<unknown>') > so that there's a clear flag that you're not seeing anything? > > I'm not wedded to the particular string <unknown>, if someone has a > better idea. Yes, we really should report something on all the stats tables when they aren't enabled. Would safe a lot of headaches. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Wed, 2002-07-17 at 10:31, Jan Wieck wrote: > Robert Treat wrote: > > > > those are the ways I had been thinking, here are some results: > > > > rms=# select * from pg_stat_activity; > > datid | datname | procpid | usesysid | usename | current_query > > ----------+---------+---------+----------+----------+--------------- > > 11914305 | rms | 2355 | 1 | postgres | > > 11914305 | rms | 29985 | 1 | postgres | > > > > [...] > > > > ** with current_query blank I'm assuming we have nothing hanging around > > that would interfere right?. ** > > Assuming that you are doing it as a database superuser, wrong. It means > that your have "stats_command_string = false" or commented out in your > postgresql.conf file. Otherwise it would report "<IDLE>" or "<IDLE> in > transaction". > ok. but my ps aux | grep postgres did label all connections as idle, which should be equivalent, right? Robert Treat
Robert Treat <rtreat@webmd.net> writes: > ok. but my ps aux | grep postgres did label all connections as idle, > which should be equivalent, right? If it said "idle" and not "idle in transaction" then yes ... but make sure your ps isn't truncating the string such that you can't see the "in transaction" part. regards, tom lane
On Wed, 2002-07-17 at 11:14, Manfred Koizar wrote: > On 16 Jul 2002 15:45:08 -0400, Robert Treat <rtreat@webmd.net> wrote: > >> > We run a function against the table > >> > about every 5 minutes which updates on average maybe 100 rows and adds > >> > rows at the rate of maybe 1 an hour, > > Robert, are you sure about those 100 updated rows/5 minutes? > Yesterday I started thinking this as well and after much digging and swearing that there was no way I was updating more than 300/5 minutes, I have found an update statement in one of the functions that contains no where clause. I need to do some more digging, but if I'm right this means I am updating all 5500 rows every 5 minutes. Actually I am probably updating more but I think at least 5500 rows are getting updated! > > > > relname | relkind | relpages | mb > >-----------------------+---------+----------+----- > > health_ex_group | i | 21 | 0 > > health_exception_test | r | 24839 | 194 > > > > >i then ran vacuum analyze on the table which gives me sizes of: > > > > relname | relkind | relpages | mb > >-----------------------+---------+----------+----- > > health_ex_group | i | 686 | 5 > > health_exception_test | r | 26331 | 205 > > So in the time between reindex and vacuum your table has grown by 1500 > pages or (estimated) 150000 tuples. That's 30 times the number of > rows, or - in other words - at a rate of 20 rows/minute this growth > would be expected in 100 days. > > Now I may be wrong, but ISTM there is a process (or more) running that > does a *lot* of updates. Can you tell us something about the function > that is supposed to update 100 rows every five minutes? Is anything > else doing updates you were not aware of at first sight? > Everything else falls into place with that many updates. Clearly my FSM would be too small to remember all of that, so my vacuums had little chance of being effective. Temporarily I should be able to add a vacuum every 5 minutes along with the function call to keep things from getting out of hand until the function is fixed. Thanks to everyone else who helped out on this, hopefully this thread will prove of use to some other folks. Robert Treat
Tom Lane wrote: > > nconway@klamath.dyndns.org (Neil Conway) writes: > > On Wed, Jul 17, 2002 at 09:45:12AM -0400, Robert Treat wrote: > >> ** with current_query blank I'm assuming we have nothing hanging around > >> that would interfere right?. ** > > > No, that could also indicate that either you're not the super-user, or you > > haven't enabled stats_command_string in postgresql.conf > > > That probably isn't the best UI... > > Good point. What do people think of making the pg_stat_activity view do > COALESCE(current_query, '<unknown>') > so that there's a clear flag that you're not seeing anything? Shouldn't be done on the view. The builtin function used to extract that information from the stats file can do it much better. If memory serves it's also the one who decides to return NULL if you're not a superuser. So it could return <permission denied> or similar in that case to distinguish. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes: > Tom Lane wrote: >> Good point. What do people think of making the pg_stat_activity view do >> COALESCE(current_query, '<unknown>') >> so that there's a clear flag that you're not seeing anything? > Shouldn't be done on the view. The builtin function used to extract that > information from the stats file can do it much better. No, I think the view is exactly the right place to do it. If we do it in the function then we are forcing a UI-prettiness issue onto applications that may not want it; but they'll have no way to bypass it. > So it could return <permission denied> or similar in that case to > distinguish. I don't think it's all that important to distinguish the reason why you're not seeing it. Returning NULL from the function is a perfectly good convention at the level of the function. regards, tom lane
Tom Lane wrote: > > Jan Wieck <JanWieck@Yahoo.com> writes: > > Tom Lane wrote: > >> Good point. What do people think of making the pg_stat_activity view do > >> COALESCE(current_query, '<unknown>') > >> so that there's a clear flag that you're not seeing anything? > > > Shouldn't be done on the view. The builtin function used to extract that > > information from the stats file can do it much better. > > No, I think the view is exactly the right place to do it. If we do it > in the function then we are forcing a UI-prettiness issue onto > applications that may not want it; but they'll have no way to bypass it. > > > So it could return <permission denied> or similar in that case to > > distinguish. > > I don't think it's all that important to distinguish the reason why > you're not seeing it. Returning NULL from the function is a perfectly > good convention at the level of the function. Could you then please explain why you want to change anything at all? Doesn't NULL perfectly express "unknown", at least if it's in the result set of an ANSI compliant database? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes: > Could you then please explain why you want to change anything at all? > Doesn't NULL perfectly express "unknown", at least if it's in the result > set of an ANSI compliant database? Well, I'm not dead set on changing the view rather than the function. Anyone else have an opinion? regards, tom lane
Tom Lane wrote: > Jan Wieck <JanWieck@Yahoo.com> writes: > > Tom Lane wrote: > >> Good point. What do people think of making the pg_stat_activity view do > >> COALESCE(current_query, '<unknown>') > >> so that there's a clear flag that you're not seeing anything? > > > Shouldn't be done on the view. The builtin function used to extract that > > information from the stats file can do it much better. > > No, I think the view is exactly the right place to do it. If we do it > in the function then we are forcing a UI-prettiness issue onto > applications that may not want it; but they'll have no way to bypass it. > > > So it could return <permission denied> or similar in that case to > > distinguish. > > I don't think it's all that important to distinguish the reason why > you're not seeing it. Returning NULL from the function is a perfectly > good convention at the level of the function. I think you do need to distinguish "permission denied" from "not enabled". I think our "not enabled" status for this feature is confusing enough for you have to report is separately. You could return "permission denied or not enabled", I guess. In fact, I would put the text in parens, then apps can know that strings in quotes are errors, but of course, if they issue "(select * from pg_class)" there are problems. Maybe "-- permission denied". -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I would put the text in parens, then apps can know that strings in > quotes are errors, but of course, if they issue "(select * from > pg_class)" there are problems. Maybe "-- permission denied". What's wrong with <message>, like we already have there for <idle>? regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I would put the text in parens, then apps can know that strings in > > quotes are errors, but of course, if they issue "(select * from > > pg_class)" there are problems. Maybe "-- permission denied". > > What's wrong with <message>, like we already have there for <idle>? Oh, interesting. I guess I saw <idle> as a valid status, while permission/enable was more of an error so should have a different format. I could go either way. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026