Thread: Rapidly decaying performance repopulating a large table
I have a fairly simple table (a dozen real/integer columns, a few indexes, one foreign key reference) with ~120m rows. Periodically the table is truncated or dropped and recreated and the data is regenerated (slightly different data, of course, or the exercise would be rather pointless). The regeneration occurs in batches of ~4000 data points at a time, which are inserted into the table via COPY, and are coming from several simultaneous processes.

The first several million data points are quite quick (the COPY executes in well under a quarter second). By the time the table reaches 10-15m rows, however, each individual COPY is taking >20 seconds to execute. Is there anything I can do to improve this performance?

I can't drop/recreate the indices because some of the data points rely on points generated earlier in the run, and dropping the indices would make the required joins ridiculously expensive once the table starts growing. The foreign key reference *is* droppable for this regeneration, but I wouldn't expect it to be a performance problem.

The autovacuum daemon is running in the background, with these settings (all autovacuum-specific settings are still at defaults):

  vacuum_cost_delay = 50         # 0-1000 milliseconds
  vacuum_cost_page_hit = 1       # 0-10000 credits
  vacuum_cost_page_miss = 10     # 0-10000 credits
  vacuum_cost_page_dirty = 20    # 0-10000 credits
  vacuum_cost_limit = 200        # 1-10000 credits

My gut feeling is that better autovacuum settings would help, but I'm not really sure what to modify to get the improvement I'm looking for.

-- 
- David T. Wilson
david.t.wilson@gmail.com
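[Editor's note: a minimal sketch of the load pattern being described; the table, column, and index names below are invented for illustration and do not appear in the thread.]

```sql
-- Hypothetical stand-in for the table described above: a few
-- real/integer columns, an index, and one outgoing FK reference.
CREATE TABLE datapoints (
    series_id integer NOT NULL REFERENCES series (series_id),
    t         integer NOT NULL,
    val1      real,
    val2      real
);
CREATE INDEX datapoints_series_t_idx ON datapoints (series_id, t);

-- Each of several concurrent worker processes repeatedly loads a
-- batch of ~4000 rows in one round trip:
COPY datapoints (series_id, t, val1, val2) FROM STDIN;
```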
On Tue, Apr 22, 2008 at 2:31 PM, David Wilson <david.t.wilson@gmail.com> wrote:
> I have a fairly simple table (a dozen real/integer columns, a few
> indexes, one foreign key reference) with ~120m rows. Periodically the
> table is truncated or dropped and recreated and the data is
> regenerated (slightly different data, of course, or the exercise would
> be rather pointless). The regeneration occurs in batches of ~4000 data
> points at a time, which are inserted into the table via COPY, and are
> coming from several simultaneous processes.
>
> The first several million data points are quite quick (the copy
> executes in well under a quarter second). By the time the table
> reaches 10-15m rows, however, each individual COPY is taking >20
> seconds to execute. Is there anything I can do to improve this
> performance? I can't drop/recreate the indices because some of the

The best bet is to issue an "analyze table" (with your table name in there, of course) and see if that helps. Quite often the real issue is that pgsql is still using a plan that made perfect sense when you had 100 rows but is no longer the best way to handle 10 million of them.
On Tue, Apr 22, 2008 at 4:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> The best bet is to issue an "analyze table" (with your table name in
> there, of course) and see if that helps. Quite often the real issue
> is that pgsql is using a method to insert rows when you have 10million
> of them that made perfect sense when you had 100 rows, but no longer
> is the best way.
>

This has caused the behavior to be... erratic. That is, individual copies are now taking anywhere from 2 seconds (great!) to 30+ seconds (back where we were before). I also clearly can't ANALYZE the table after every 4k batch; even if that resulted in 2 second copies, the analyze would take as much time as the copy otherwise would have. I could conceivably analyze after every ~80k (the next larger unit of batching; I'd love to be able to batch the copies at that level, but dependencies ensure that I can't), but it seems odd to have to analyze so often.

Oh, barring COPY delays, I'm generating the data at a rate of something like a half million rows every few minutes, if that's relevant.

-- 
- David T. Wilson
david.t.wilson@gmail.com
On Tue, Apr 22, 2008 at 2:59 PM, David Wilson <david.t.wilson@gmail.com> wrote:
> On Tue, Apr 22, 2008 at 4:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> > The best bet is to issue an "analyze table" (with your table name in
> > there, of course) and see if that helps. Quite often the real issue
> > is that pgsql is using a method to insert rows when you have 10million
> > of them that made perfect sense when you had 100 rows, but no longer
> > is the best way.
> >
>
> This has caused the behavior to be... erratic. That is, individual
> copies are now taking anywhere from 2 seconds (great!) to 30+ seconds
> (back where we were before). I also clearly can't ANALYZE the table
> after every 4k batch; even if that resulted in 2 second copies, the
> analyze would take up as much time as the copy otherwise would have
> been. I could conceivably analyze after every ~80k (the next larger
> unit of batching; I'd love to be able to batch the copies at that
> level but dependencies ensure that I can't), but it seems odd to have
> to analyze so often.

Normally, after the first 50,000 rows or so, the plan won't likely change due to a new analyze, so you could probably just analyze after 50k or so and get the same performance, assuming the problem really is a bad plan for the inserts / copies.

Also, non-indexed foreign keyed fields can cause this problem.
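[Editor's note: concretely, the suggestion amounts to a one-off statistics refresh once the table has some representative data in it, plus a check that the foreign-key columns really are indexed; the table name below is hypothetical.]

```sql
-- Refresh planner statistics once, after the first ~50k rows are in:
ANALYZE datapoints;

-- List the table's indexes to confirm the FK columns are covered:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'datapoints';
```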
On Tue, Apr 22, 2008 at 5:04 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Normally, after the first 50,000 or so the plan won't likely change
> due to a new analyze, so you could probably just analyze after 50k or
> so and get the same performance. If the problem is a bad plan for the
> inserts / copies.
>
> also, non-indexed foreign keyed fields can cause this problem.
>

Analyzing after the first 50k or so is easy enough, then; thanks for the suggestion.

Foreign keys are definitely indexed (actually referencing a set of columns that the foreign table is UNIQUE on).

Any other suggestions? COPY times alone are pretty much quadrupling my table-rebuild runtime, and I can interrupt the current rebuild to try things at a whim (nothing else uses the DB while a rebuild is happening), so I'm game to try any reasonable suggestions anyone has.

-- 
- David T. Wilson
david.t.wilson@gmail.com
On Tue, Apr 22, 2008 at 3:15 PM, David Wilson <david.t.wilson@gmail.com> wrote:
> On Tue, Apr 22, 2008 at 5:04 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> > Normally, after the first 50,000 or so the plan won't likely change
> > due to a new analyze, so you could probably just analyze after 50k or
> > so and get the same performance. If the problem is a bad plan for the
> > inserts / copies.
> >
> > also, non-indexed foreign keyed fields can cause this problem.
> >
>
> Analyzing after the first 50k or so is easy enough, then; thanks for
> the suggestion.
>
> Foreign keys are definitely indexed (actually referencing a set of
> columns that the foreign table is UNIQUE on).
>
> Any other suggestions? COPY times alone are pretty much quadrupling my
> table-rebuild runtime, and I can interrupt the current rebuild to try
> things pretty much at a whim (nothing else uses the DB while a rebuild
> is happening), so I'm pretty much game to try any reasonable
> suggestions anyone has.

Try upping your checkpoint segments. Some folks find fairly large numbers like 50 to 100 to be helpful. Each segment = 16MB, so be sure not to run your system out of drive space while increasing it.
On Tue, Apr 22, 2008 at 5:18 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
> Try upping your checkpoint segments. Some folks find fairly large
> numbers like 50 to 100 to be helpful. Each segment = 16Megs, so be
> sure not to run your system out of drive space while increasing it.
>

Ahh, much more progress. Upping the segments to 50, the timeout to 30m, and the completion target to 0.9 has improved average copy time to between 2 and 10 seconds, which is definitely an improvement. Thanks for the help. Any other random thoughts while you're at it? :)

-- 
- David T. Wilson
david.t.wilson@gmail.com
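[Editor's note: in postgresql.conf terms (8.3 syntax), the settings David describes are roughly the following.]

```ini
checkpoint_segments = 50            # default 3; each WAL segment is 16MB
checkpoint_timeout = 30min          # default 5min
checkpoint_completion_target = 0.9  # default 0.5; spreads checkpoint I/O
                                    # over more of the checkpoint interval
```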
On Apr 22, 2008, at 4:46 PM, David Wilson wrote:
> On Tue, Apr 22, 2008 at 5:18 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>
>> Try upping your checkpoint segments. Some folks find fairly large
>> numbers like 50 to 100 to be helpful. Each segment = 16Megs, so be
>> sure not to run your system out of drive space while increasing it.
>>
>
> Ahh, much more progress. Upping the segments to 50, timeout to 30m and
> completion target to 0.9 has improved average copy time to between 2
> and 10 seconds, which is definitely an improvement. Thanks for the
> help. Any other random thoughts while you're at it? :)

Has anyone yet pointed out the standard advice: drop indexes and foreign keys and rebuild them once the entire data import is finished?

Erik Jones

DBA | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
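[Editor's note: since David noted earlier that only the FK constraint is droppable here, the standard drop-and-rebuild would be limited to that constraint. The constraint, column, and table names below are invented for illustration.]

```sql
-- Before the rebuild, drop the outgoing FK constraint:
ALTER TABLE datapoints DROP CONSTRAINT datapoints_series_id_fkey;

-- ... regenerate the data ...

-- Afterwards, re-add it; the constraint is validated in one pass,
-- which is far cheaper than 120m individual per-row checks:
ALTER TABLE datapoints
    ADD CONSTRAINT datapoints_series_id_fkey
    FOREIGN KEY (series_id) REFERENCES series (series_id);
```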
"David Wilson" <david.t.wilson@gmail.com> writes:
> Foreign keys are definitely indexed (actually referencing a set of
> columns that the foreign table is UNIQUE on).

Are you loading any tables that are the targets of foreign key references from other tables being loaded? If so, I'd bet on Scott's theory being correct with respect to the plans for checks of those FK constraints. The COPY itself hasn't got any plan, and inserting rows into a table should be constant-time in itself, so it seems to me there are only two possibilities for a progressive slowdown:

* the cost of updating the indexes, which for standard btree indexes ought to grow at about log(N) when there are already N entries

* bad plans in either foreign-key triggers or user-defined triggers attached to the tables.

You failed to mention what PG version this is (tut tut) but if it's less than 8.3 then ANALYZE alone won't fix bad plans in triggers; you'd need to analyze and then start a fresh database session.

You said that you don't need to have the FK constraints present, so I'd strongly suggest trying it without ...

			regards, tom lane
On Tue, Apr 22, 2008 at 6:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "David Wilson" <david.t.wilson@gmail.com> writes:
>
> Are you loading any tables that are the targets of foreign key
> references from other tables being loaded? If so, I'd bet on
> Scott's theory being correct with respect to the plans for checks
> of those FK constraints. The COPY itself hasn't got any plan,
> and inserting rows into a table should be constant-time in itself,
> so it seems to me there are only two possibilities for a progressive
> slowdown:
>
> * the cost of updating the indexes, which for standard btree indexes
> ought to grow at about log(N) when there are already N entries
>
> * bad plans in either foreign-key triggers or user-defined triggers
> attached to the tables.

Only one table is being regenerated, and it's not the target of any foreign key checks itself; it merely has a single FK reference out to one unchanging table. There are no triggers on the table.

> You failed to mention what PG version this is (tut tut) but if it's
> less than 8.3 then ANALYZE alone won't fix bad plans in triggers;
> you'd need to analyze and then start a fresh database session.

PG is 8.3.1.

I certainly expect some slowdown, given that I have indices that I can't drop (as you indicate above). Having been watching it now for a bit, I believe that the checkpoint settings were the major cause of the problem, however. Changing those settings has dropped the copy times back down toward what I'd expect; I have also now dropped the FK constraint, but that has made no perceptible difference in time.

My guess at this point is that I'm just running into index update times and checkpoint IO. The only thing that still seems strange is the highly variable nature of the COPY times: anywhere from <1.0 seconds to >20 seconds, with an average probably around 8ish. I can live with that, but I'm still open to any other suggestions anyone has! Thanks for the help so far.

-- 
- David T. Wilson
david.t.wilson@gmail.com
"David Wilson" <david.t.wilson@gmail.com> writes:
> My guess at this point is that I'm just running into index update
> times and checkpoint IO. The only thing that still seems strange is
> the highly variable nature of the COPY times- anywhere from <1.0
> seconds to >20 seconds, with an average probably around 8ish. I can
> live with that, but I'm still open to any other suggestions anyone
> has!

What have you got shared_buffers set to? If it's not enough to cover the working set for your indexes, that might be the (other) problem.

			regards, tom lane
On Tue, Apr 22, 2008 at 7:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> What have you got shared_buffers set to? If it's not enough to cover
> the working set for your indexes, that might be the (other) problem.
>

shared_buffers = 1536MB

Is there a way to get the size of a specific index, on that note? There seem to be access functions for the relation + indices, and for the relation by itself, but not for a specific index out of possibly several.

I could increase shared_buffers some, but client apps on the same machine occasionally also have hefty memory requirements (not during these regeneration runs, but it seems like restarting the server with a new shared_buffers value before and after the regeneration is a bit of overkill).

-- 
- David T. Wilson
david.t.wilson@gmail.com
On Tue, 22 Apr 2008, David Wilson wrote:
> My guess at this point is that I'm just running into index update
> times and checkpoint IO. The only thing that still seems strange is
> the highly variable nature of the COPY times- anywhere from <1.0
> seconds to >20 seconds, with an average probably around 8ish.

Have you turned on log_checkpoints to see whether those are correlated with the slow ones? Given that you've had an improvement by increasing checkpoint_segments, it's not out of the question that you're still getting nailed sometimes during the more stressful portions of the checkpoint cycle (usually right near the end). The 1 second ones might just happen to be ones that start just as the previous checkpoint finished.

To make lining those up easier, you might turn on logging of long statements with log_min_duration_statement, so that both bits of data appear in the same log file. That might get you some other accidental enlightenment as well (like if there's some other statement going on that's colliding with this load badly).

This is a bit out of my area, but after reading the rest of this thread I wonder whether raising the default_statistics_target parameter a bit might reduce the instances of bad plans showing up.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
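[Editor's note: the two logging settings Greg mentions would look like this in postgresql.conf; the 1000ms threshold is an arbitrary choice for this workload, not a value from the thread.]

```ini
log_checkpoints = on               # log the start and end of each checkpoint
log_min_duration_statement = 1000  # log any statement that runs longer
                                   # than 1000ms (-1 disables, 0 logs all)
```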
On Tue, 22 Apr 2008, David Wilson wrote:
> Is there a way to get the size of a specific index, on that note?

select pg_size_pretty(pg_relation_size('index_name')) works for me. There's a neat article on other things you can look at like this at http://andreas.scherbaum.la/blog/archives/282-table-size,-database-size.html

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
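[Editor's note: to cover David's case of one index out of several, one approach is to go through pg_stat_user_indexes, which carries the OID of each index on a table; the table name below is hypothetical.]

```sql
-- Size of every index on one table, one row per index:
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'datapoints';
```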
Greg Smith <gsmith@gregsmith.com> writes:
> ...This is a bit out of my area, but after reading the rest of this thread I
> wonder whether raising the default_statistics_target parameter a bit might
> reduce the instances of bad plans showing up.

On the evidence so far, it doesn't seem that David's problem has anything to do with bad plans --- I suspected that at first, but the theory's been shot down. I'm now thinking it's an I/O bottleneck in some form ...

			regards, tom lane
On Tue, 2008-04-22 at 18:46 -0400, David Wilson wrote:
> I certainly expect some slowdown, given that I have indices that I
> can't drop (as you indicate above). Having been watching it now for a
> bit, I believe that the checkpoint settings were the major cause of
> the problem, however. Changing those settings has dropped the copy
> times back down toward what I'd expect; I have also now dropped the FK
> constraint, but that has made no perceptible difference in time.
>
> My guess at this point is that I'm just running into index update
> times and checkpoint IO. The only thing that still seems strange is
> the highly variable nature of the COPY times- anywhere from <1.0
> seconds to >20 seconds, with an average probably around 8ish. I can
> live with that, but I'm still open to any other suggestions anyone
> has!

I think it would be good to see some graphs of this.

The drop in speed can be explained by growing index size. The variability in performance can be explained by variations in the data distribution of the indexed column, i.e. the I/O isn't actually random in the statistical sense. The speed of the COPY probably depends mostly on how many infrequently occurring values you have in each set of loaded data.

However, that thinking could mask other problems. Try log_statement_stats=on to see if the I/O is increasing per call and whether the I/O is correlated with the performance.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
On 2008-04-22 23:46, David Wilson wrote:
> Upping the segments to 50, timeout to 30m and completion target to
> 0.9 has improved average copy time to between 2 and 10 seconds, which
> is definitely an improvement.

I'd up them to 128 (or even 256) and set the completion target back to 0.5. But make sure you'll always have 4GB (8GB) of disk space free for the WAL logs. All you'd risk is several minutes of recovery in case of a server crash.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a moment just before you began to eat it which was better than when you were...
Winnie the Pooh
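[Editor's note: the 8.3 documentation bounds the number of WAL segment files at roughly (2 + checkpoint_completion_target) * checkpoint_segments + 1, so the rough disk-space figures above can be sanity-checked; the calculation below comes out slightly above the quoted 4GB for 128 segments.]

```sql
-- Approximate peak WAL size in MB for checkpoint_segments = 128
-- with checkpoint_completion_target = 0.5:
SELECT ((2 + 0.5) * 128 + 1) * 16 AS approx_max_wal_mb;  -- roughly 5GB
```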
Thanks for the help from everyone on this. Further investigation with the suggested statistics, correlated against some IO graphs, pretty much nailed the problem down to checkpoint IO holding things up, and tuning the checkpoint segments and completion target (128 and 0.9 seemed to be the best tradeoff for me) cleared things up.

All told, the run time of this regeneration pass was less than half of what it was the last time I ran one, despite involving more total data. Much appreciated.

-- 
- David T. Wilson
david.t.wilson@gmail.com