Thread: lots of updates on small table
Hi,

Our application requires a number of processes to select and update rows
from a very small (<10 rows) Postgres table on a regular and frequent
basis.  These processes often run for weeks at a time, but over the
space of a few days we find that updates start getting painfully slow.
We are running a full vacuum/analyze and reindex on the table every day,
but the updates keep getting slower and slower until the processes are
restarted.  Restarting the processes isn't really a viable option in our
24/7 production environment, so we're trying to figure out what's
causing the slow updates.

The environment is as follows:

Red Hat 9, kernel 2.4.20-8
PostgreSQL 7.3.2
ecpg 2.10.0

The processes are all compiled C programs accessing the database using
ECPG.

Does anyone have any thoughts on what might be happening here?

Thanks
Alison
On Thu, Jul 14, 2005 at 03:08:30PM +1000, Alison Winters wrote:
> Hi,
>
> Our application requires a number of processes to select and update rows
> from a very small (<10 rows) Postgres table on a regular and frequent
> basis.  These processes often run for weeks at a time, but over the
> space of a few days we find that updates start getting painfully slow.
> We are running a full vacuum/analyze and reindex on the table every day,

Full vacuum, eh?  I wonder if what you really need is very frequent
non-full vacuum.  Say, once every 15 minutes (the exact rate depending
on the dead tuple rate).

-- 
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"World domination is proceeding according to plan"  (Andrew Morton)
On Thu, 2005-07-14 at 15:08 +1000, Alison Winters wrote:
> Hi,
>
> Our application requires a number of processes to select and update rows
> from a very small (<10 rows) Postgres table on a regular and frequent
> basis.  These processes often run for weeks at a time, but over the

Are these long-running transactions, or is the process issuing many
short transactions?  If your transaction lasts a week, then a daily
vacuum isn't really doing anything.  I presume you also run ANALYZE in
some shape or form periodically?

> space of a few days we find that updates start getting painfully slow.
> We are running a full vacuum/analyze and reindex on the table every day,

If they're short transactions, run vacuum (not vacuum full) every 100 or
so updates.  This might even be once a minute.  Analyze periodically as
well.
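Something like this, fired from cron, would do it (just a sketch - put
your own table name in):

VACUUM ANALYZE mytable;  -- "mytable" is a placeholder; plain vacuum
                         -- doesn't lock out concurrent updates

--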
Hi,

> > Our application requires a number of processes to select and update rows
> > from a very small (<10 rows) Postgres table on a regular and frequent
> > basis.  These processes often run for weeks at a time, but over the
> > space of a few days we find that updates start getting painfully slow.
> > We are running a full vacuum/analyze and reindex on the table every day,
>
> Full vacuum, eh?  I wonder if what you really need is very frequent
> non-full vacuum.  Say, once every 15 minutes (the exact rate depending
> on the dead tuple rate).

Is there a difference between vacuum and vacuum full?  Currently we have
a cron job going every hour that does:

VACUUM FULL VERBOSE ANALYZE plc_fldio
REINDEX TABLE plc_fldio

The most recent output was this:

INFO:  --Relation public.plc_fldio--
INFO:  Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256.
        CPU 0.04s/0.14u sec elapsed 0.18 sec.
INFO:  Index plcpage_idx: Pages 315; Tuples 108137: Deleted 4176.
        CPU 0.03s/0.04u sec elapsed 0.14 sec.
INFO:  Rel plc_fldio: Pages: 1221 --> 1221; Tuple(s) moved: 0.
        CPU 0.03s/0.04u sec elapsed 0.36 sec.
INFO:  Analyzing public.plc_fldio
VACUUM
REINDEX

We'll up it to every 15 minutes, but I don't know if that'll help,
because even with the current vacuuming the updates are still getting
slower and slower over the course of several days.  What really puzzles
me is why restarting the processes fixes it.  Does PostgreSQL keep some
kind of backlog of transactions all for one database connection?  Isn't
it normal to have processes that keep a single database connection open
for days at a time?

Regarding the question another poster asked: all the transactions are
very short.  The table is essentially a database replacement for a
shared memory segment - it contains a few rows of byte values that are
constantly updated byte-at-a-time to communicate data between different
industrial control processes.

Thanks for the thoughts everyone,

Alison
Alison Winters wrote:
> Hi,
>
>>> Our application requires a number of processes to select and update rows
>>> from a very small (<10 rows) Postgres table on a regular and frequent
>>> basis.  These processes often run for weeks at a time, but over the
>>> space of a few days we find that updates start getting painfully slow.
>>> We are running a full vacuum/analyze and reindex on the table every day,
>>
>> Full vacuum, eh?  I wonder if what you really need is very frequent
>> non-full vacuum.  Say, once every 15 minutes (the exact rate depending
>> on the dead tuple rate).
>>
>
> Is there a difference between vacuum and vacuum full?  Currently we have
> a cron job going every hour that does:
>
> VACUUM FULL VERBOSE ANALYZE plc_fldio
> REINDEX TABLE plc_fldio

VACUUM FULL exclusively locks the table (so that nothing else can
happen) and then compacts it as much as it can.  You almost definitely
want to only VACUUM every 15 min, and maybe VACUUM FULL once per day.
VACUUM FULL is more for when you haven't been VACUUMing often enough, or
have made major changes to your table.

Basically VACUUM marks rows as empty and available for reuse; VACUUM
FULL removes empty space (but requires a full lock, because it is moving
rows around).

If anything, I would estimate that VACUUM FULL would be hurting your
performance.  But it may happen fast enough not to matter.

> The most recent output was this:
>
> INFO:  --Relation public.plc_fldio--
> INFO:  Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256.
> CPU 0.04s/0.14u sec elapsed 0.18 sec.
> INFO:  Index plcpage_idx: Pages 315; Tuples 108137: Deleted 4176.
> CPU 0.03s/0.04u sec elapsed 0.14 sec.
> INFO:  Rel plc_fldio: Pages: 1221 --> 1221; Tuple(s) moved: 0.
> CPU 0.03s/0.04u sec elapsed 0.36 sec.
> INFO:  Analyzing public.plc_fldio
> VACUUM
> REINDEX
>
> We'll up it to every 15 minutes, but I don't know if that'll help,
> because even with the current vacuuming the updates are still getting
> slower and slower over the course of several days.  What really puzzles
> me is why restarting the processes fixes it.  Does PostgreSQL keep some
> kind of backlog of transactions all for one database connection?  Isn't
> it normal to have processes that keep a single database connection open
> for days at a time?

I believe that certain locks are grabbed per session, or at least there
is some command that you can run which you don't want to run in a
long-maintained connection.  (It might be VACUUM FULL, I don't remember
which one it is.)  But the fact that your application works at all
suggests that it isn't holding any long-lived locks; a peek at pg_locks,
sketched at the end of this message, would show it either way.  I know
VACUUM cannot clean up any rows that are still visible in one of the
transactions; I don't know if this includes active connections or not.

> Regarding the question another poster asked: all the transactions are
> very short.  The table is essentially a database replacement for a
> shared memory segment - it contains a few rows of byte values that are
> constantly updated byte-at-a-time to communicate data between different
> industrial control processes.

Is it possible to have some sort of timer that would recognize it has
been connected for too long, drop the database connection, and
reconnect?  I don't know that it would solve anything, but it would be
something you could try.
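The pg_locks query I mentioned would look something like this (a sketch
from memory of the 7.3 docs, so the column names may be slightly off):

SELECT l.pid, l.mode, l.granted
FROM pg_locks l, pg_class c
WHERE l.relation = c.oid
  AND c.relname = 'plc_fldio';

Anything long-lived and granted against that table would be suspicious.

John
=:->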
alison@mirrabooka.com (Alison Winters) writes:
>>> Our application requires a number of processes to select and update rows
>>> from a very small (<10 rows) Postgres table on a regular and frequent
>>> basis.  These processes often run for weeks at a time, but over the
>>> space of a few days we find that updates start getting painfully slow.

No wonder, considering that your "less than 10 rows" table contains
something upwards of 100000 tuples:

> INFO:  --Relation public.plc_fldio--
> INFO:  Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256.
> CPU 0.04s/0.14u sec elapsed 0.18 sec.

What you need to do is find out why VACUUM is unable to reclaim all
those dead row versions.  The reason is likely that some process is
sitting on an open transaction for days at a time.

> Isn't it normal to have processes that keep a single database
> connection open for days at a time?

Database connection, sure.  Single transaction, no.

> Regarding the question another poster asked: all the transactions are
> very short.

Somewhere you have one that isn't.  Try watching the backends with ps,
or look at the pg_stat_activity view if your version of PG has it, to
see which sessions are staying "idle in transaction" indefinitely.
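Something along these lines would show it, assuming stats_command_string
is turned on in postgresql.conf (a sketch from memory of the 7.3-era
view; check the column names against your documentation):

SELECT procpid, usename, current_query
FROM pg_stat_activity;

Any backend whose current_query sits at "<IDLE> in transaction" from one
sample to the next is the one to go after.

			regards, tom lane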
On Fri, Jul 15, 2005 at 09:42:12AM +1000, Alison Winters wrote:
> > > Our application requires a number of processes to select and update rows
> > > from a very small (<10 rows) Postgres table on a regular and frequent
> > > basis.  These processes often run for weeks at a time, but over the
> > > space of a few days we find that updates start getting painfully slow.
> > > We are running a full vacuum/analyze and reindex on the table every day,
> > Full vacuum, eh?  I wonder if what you really need is very frequent
> > non-full vacuum.  Say, once every 15 minutes (the exact rate depending
> > on the dead tuple rate).
>
> Is there a difference between vacuum and vacuum full?

Yes.  Vacuum full is more aggressive in compacting the table.  Though it
really works the same in the presence of long-running transactions:
tuples just can't be removed.

> The most recent output was this:
>
> INFO:  --Relation public.plc_fldio--
> INFO:  Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256.
> CPU 0.04s/0.14u sec elapsed 0.18 sec.
> INFO:  Index plcpage_idx: Pages 315; Tuples 108137: Deleted 4176.
> CPU 0.03s/0.04u sec elapsed 0.14 sec.
> INFO:  Rel plc_fldio: Pages: 1221 --> 1221; Tuple(s) moved: 0.
> CPU 0.03s/0.04u sec elapsed 0.36 sec.
> INFO:  Analyzing public.plc_fldio

Hmm, so it seems your hourly vacuum is enough.  I think the bloat theory
can be trashed.  Unless I'm reading this output wrong; I don't remember
the details of this vacuum output.

> We'll up it to every 15 minutes, but I don't know if that'll help,
> because even with the current vacuuming the updates are still getting
> slower and slower over the course of several days.  What really puzzles
> me is why restarting the processes fixes it.

I wonder if the problem may be plan caching.  I didn't pay full
attention to the description of your problem, so I don't remember if it
could be an issue, but it's something to consider.

> Does PostgreSQL keep some kind of backlog of transactions all for one
> database connection?

No.  There could be a problem if you had very long transactions, but
apparently this isn't your problem.

> Isn't it normal to have processes that keep a single database
> connection open for days at a time?

I guess it depends on exactly what you do with it.  I know of at least
one case where an app keeps a connection open for months, without a
problem.  (It's been running for four or five years, and monthly
"uptime" for that particular daemon is not unheard of.)

-- 
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Everybody understands Mickey Mouse.  Few understand Hermann Hesse.
Hardly anybody understands Einstein.  And nobody understands Emperor
Norton."
On Thu, Jul 14, 2005 at 08:28:24PM -0400, Alvaro Herrera wrote:
> On Fri, Jul 15, 2005 at 09:42:12AM +1000, Alison Winters wrote:
>
> > > INFO:  Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256.
>
> Hmm, so it seems your hourly vacuum is enough.  I think the bloat theory
> can be trashed.  Unless I'm reading this output wrong; I don't remember
> the details of this vacuum output.

Ok, so I was _very_ wrong :-)  Sorry.

-- 
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Everybody understands Mickey Mouse.  Few understand Hermann Hesse.
Hardly anybody understands Einstein.  And nobody understands Emperor
Norton."
Hi all,

> No wonder, considering that your "less than 10 rows" table contains
> something upwards of 100000 tuples:
>
> > INFO:  --Relation public.plc_fldio--
> > INFO:  Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256.
> > CPU 0.04s/0.14u sec elapsed 0.18 sec.
>
> What you need to do is find out why VACUUM is unable to reclaim all
> those dead row versions.  The reason is likely that some process is
> sitting on an open transaction for days at a time.

Cheers mate, that was one of our theories but we weren't sure if it'd be
worth rebuilding everything to check.  We've been compiling without the
-t (autocommit) flag to ecpg, and I believe what's happening is that
sometimes a transaction is begun, and then the processes cycle around
doing hardware I/O and never commit, or only commit way too late
(roughly the pattern sketched at the end of this message).  What we're
going to try now is to remove all the begins and commits from the code
and compile with -t to make sure that any updates happen immediately.
Hopefully that'll avoid any hanging transactions.  We'll also set up a
10-minutely vacuum (not full) as per some other suggestions here.

I'll let you know how it goes - we'll probably slot everything in on
Monday so we have a week to follow it.
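For the archives, the pattern we think we had boils down to this (the
column names below are invented for illustration - the real table just
holds rows of byte values):

BEGIN;
-- "byte_val" and "byte_no" are made-up names
UPDATE plc_fldio SET byte_val = 255 WHERE byte_no = 42;
-- control then goes back to the hardware i/o loop with no COMMIT for
-- hours or days, so nothing updated in the meantime can be reclaimed

Thanks everyone
Alison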
Just to dig up an old thread from last month: in case anyone was
wondering, we finally got a free day to put in the new version of the
software, and it's greatly improved the performance.  The solutions we
employed were as follows:

- recompile everything with ecpg -t for auto-commit
- vacuum run by cron every 15 minutes
- vacuum full analyze AND a reindex of the table in question run by cron
  once per day

I also went through and double-checked everywhere we opened a cursor to
make sure it was always closed when we were done.  This seems to have
fixed up the endlessly growing indexes and the slower and slower
updates.
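For the record, the two cron jobs now amount to roughly this:

-- every 15 minutes:
VACUUM plc_fldio;

-- once per day:
VACUUM FULL ANALYZE plc_fldio;
REINDEX TABLE plc_fldio;

Thanks to everyone who helped out.

Alison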