Thread: Slow insert performance, 8.3 WAL related?
Hi Group. Recently upgraded from 8.1 to 8.3 on RHEL 5 64-bit. I've noticed some performance problems that I am guessing are WAL related based on my browsing around, and wondered if someone had some suggestions for tuning the WAL settings. It could also help if someone just laughed at me and told me it wasn't WAL. I have narrowed the problem down to two pretty simple descriptions.

First: I had a data load that I was doing with 8.1. It involved about 250k SQL statements that were inserts into a table with just one index. The index has two fields. With the upgrade to 8.3 that process started taking all night and half a day, inserting at the rate of 349 records a minute. When I started working on the problem I decided to test by putting all statements within a single transaction: just a simple BEGIN at the start and COMMIT at the end. Magically it took only 7 minutes to do the whole set, or 40k per minute. That seemed very odd to me, but at least I solved the problem.

Second, the most recently noticed problem: I had a table with about 20k records. We issued the statement DELETE FROM table WHERE this=that. The column was part of a combined index and about 8k records should have been deleted. This statement caused all other queries to grind to a halt. It was only when I killed it that normal operation resumed. It was acting like a lock, but that table was not being used by any other process.

So that describes what I am seeing; let me relay what we are doing, with what I think to be the relevant settings. For the log shipping, I am using scp to send the logs to a separate server. And yes, they are getting sent. I have it set now to send a log about every two minutes, since I am comfortable with that amount of data loss. Here are the uncommented WAL-related settings from the config file.
(everything below the WRITE AHEAD LOG section in the default config file)

synchronous_commit = off
checkpoint_segments = 3              # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min            # range 30s-1h
checkpoint_completion_target = 0.5   # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 30s             # 0 is off
archive_mode = on                    # allows archiving to be done
archive_command = '/var/lib/pgsql/data/logship.sh %f %p'
archive_timeout = 120                # force a logfile segment switch after this

Thanks for any help (or laughter)
Rusty
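For context, checkpoint_segments = 3 is the 8.3 default and is on the small side for bulk loads; write-heavy 8.3-era setups often raised it along these lines (the values here are illustrative assumptions, not a recommendation made in this thread):

```
checkpoint_segments = 16             # 16MB each; fewer forced checkpoints during a load
checkpoint_completion_target = 0.7   # spread checkpoint I/O out further
wal_buffers = 1MB                    # the 8.3 default is 64kB
```

More segments mainly trade disk space for fewer forced checkpoints while the load runs.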
Bill Preston <billpreston@crownepointe.net> writes:
> I've noticed some performance problems that I am guessing are WAL
> related based on my browsing around and wondered if someone had some
> suggestions for tuning the WAL settings. It could also help if someone
> just laughed at me and told me it wasn't WAL.

Consider it done ;-). I'm not sure what your problem is, but it's unlikely to be WAL, especially not if you're using the same WAL-related settings in 8.1 and 8.3. Which you might not be.

The large speedup from wrapping many small inserts into one transaction is entirely expected, and should have occurred on 8.1 as well. I am suspicious that you were running 8.1 with fsync off and 8.3 has it on. Do you still have your 8.1 postgresql.conf? Comparing all the non-defaulted settings would be the first thing to do.

If it's not that, I'm not sure. One cross-version difference that comes to mind is that 8.3 is a bit stickier about implicit casting, so it seems conceivable that something about these queries was considered indexable in 8.1 and is not in 8.3. But you've not provided enough detail to do more than speculate.

regards, tom lane
On Monday 12 January 2009, Bill Preston <billpreston@crownepointe.net> wrote:
> I had a data load that I was doing with 8.1. It involved about 250k sql
> statements that were inserts into a table with just one index. The index
> has two fields.
> With the upgrade to 8.3 that process started taking all night and 1/2 a
> day. It inserted at the rate of 349 records a minute.
> When I started working on the problem I decided to test by putting all
> statements within a single transaction. Just a simple BEGIN at the
> start and COMMIT at the end. Magically it only took 7 minutes to do the
> whole set, or 40k per minute.

That's well-known behaviour. If you don't do them in one big transaction, PostgreSQL has to fsync after every insert, which effectively limits your insert rate to the rotational speed of your WAL drive (roughly speaking). If you don't explicitly start and end transactions, PostgreSQL does it for you. For every statement.

> The most recently noticed simple problem.
> I had a table with about 20k records. We issued the statement DELETE
> FROM table where this=that.
> This was part of a combined index and about 8k records should have been
> deleted.
> This statement caused all other queries to grind to a halt. It was only
> when I killed it that normal operation resumed.

Are there foreign keys on any other table(s) that point to this one? Are the relevant columns in those tables indexed?

--
Current Peeve: The mindset that the Internet is some sort of school for novice sysadmins and that everyone -not- doing stupid dangerous things should act like patient teachers with the ones who are. -- Bill Cole, NANAE
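The batching Alan describes can be sketched with any transactional database. Below is a minimal illustration using Python's standard sqlite3 module standing in for PostgreSQL (table and index names are made up for the example); the idea is the same: one transaction means one commit-time flush for the whole batch instead of one per statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE load_test (a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX load_test_ab ON load_test (a, b)")

# Without an explicit transaction, each INSERT is its own commit
# (and, on a real disk, its own fsync). Wrapping the loop in one
# transaction means a single commit for the whole batch.
with conn:  # BEGIN ... COMMIT around the block
    for i in range(1000):
        conn.execute("INSERT INTO load_test VALUES (?, ?)", (i, i % 10))

rows = conn.execute("SELECT count(*) FROM load_test").fetchone()[0]
print(rows)  # 1000
```

With per-statement commits the same loop would pay the per-insert durability cost Alan describes; batched, it pays it once.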
Wow, thanks for all the help Tom and Alan. Sadly I was unaware of the well-known behavior. Consider it more well known now.

Fsync is off in 8.3. I am not too worried about what it was in 8.1, since we are not going back.

For the first example (bad behavior when I am not using transactions): is there any way to tell that it is going on at a given point in time? Is there a buffer that fills up, a stat somewhere that I can read? A lot of our code isn't using transactions yet, so I would like a heads-up when this problem is happening, or if possible to increase some parameter so it happens less.

As to the second example with the delete: there are no foreign keys. For the index: if the table has fields a, b, c and d, we have a btree index on (a,b,c,d) and we are saying DELETE FROM table_messed_up WHERE a=x. So the WHERE clause uses the first field in the index.

Now that you have given me more knowledge, let me ask a question that might lead to the answer. Example 1 happens in isolation. Example 2 happened on a live system with the parameters that I specified, and a whole lot of SQL statements without transactions being run at the same time. In fact there probably were a whole lot of inserts on this very table right before the delete statement was hit. Is it possible that a problem like Example 1 caused the behavior that I witnessed in Example 2? Was it waiting for the WALs to catch up or something?

Thanks
Rusty
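On the "is there any way to tell" question: 8.3 does expose lock waits through its system views. A sketch of the usual query (column names such as procpid are as of the 8.3 catalogs; verify against your version):

```
SELECT a.procpid, a.current_query, l.locktype, l.relation, l.granted
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.procpid
WHERE NOT l.granted;
```

Rows returned are sessions blocked waiting on a lock; an empty result during a stall would point away from locking as the cause.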
On Monday 12 January 2009, Bill Preston <billpreston@crownepointe.net> wrote:
> As to the second example with the delete. There are no foreign keys.
> For the index. If the table has fields a,b,c and d.
> We have a btree index (a,b,c,d)
> and we are saying DELETE FROM table_messed_up WHERE a=x.

Is there anything special about this table? Does it have like a hundred indexes on it or something? Because deleting 8k rows from a normal table should never take more than a couple of seconds.

--
Alan
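One quick check on Alan's point: run the delete under EXPLAIN (table name and predicate taken from the thread; the exact plan output will vary):

```
EXPLAIN DELETE FROM table_messed_up WHERE a = 42;
-- An Index Scan using the (a,b,c,d) btree is the expected plan here;
-- a Seq Scan would suggest a planner or implicit-casting issue
-- (as Tom mentioned) rather than anything lock-related.
```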
Bill Preston wrote:
> Fsync is off in 8.3.

You should consider turning synchronous_commit off instead. That's almost as good as fsync=off performance-wise, but doesn't leave your database corrupt in case of power loss or OS crash.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
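Heikki's suggested combination, spelled out as a config fragment:

```
fsync = on                 # data survives power loss / OS crash
synchronous_commit = off   # COMMIT returns before the WAL flush;
                           # only the last few transactions can be lost
```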
Nothing special about that table. One index.

It really seems that the system would grind to a standstill when a lot of non-transaction inserts were run combined with the creation of some large temp tables.

Since we added transactions and started using truncate, things have cleared up nicely. The suggestions here really helped.

Does anyone know of some established PostgreSQL consultants who can be hired for emergency analysis/tuning when things come up?

Rusty
On Thu, Jan 15, 2009 at 2:55 PM, Bill Preston <billpreston@crownepointe.net> wrote:
> Nothing special about that table. One index.
>
> Does anyone know of some established postgresql consultants that can be
> hired for emergency analysis/tuning when things come up?

There are several companies who have employees on this list who provide for-fee contract/consulting work. If you're local to me and need help over a weekend I might have some spare time. :) But I'm generally pretty busy on weekends.
Thanks Scott. We are in Southern California.

What I need is someone for when the SHTF again: if I can't handle it, I want some resource that can get on the job right away. And it would help if it were a company that does this kind of thing, so that I can get some buy-in from those above.

Rusty
On Thu, Jan 15, 2009 at 2:36 PM, Bill Preston <billpreston@crownepointe.net> wrote:
> We are in Southern California.
> What I need is someone for when the SHTF again, and if I can't handle it, I
> have some resource to get on the job right away.

Did you look here? http://www.postgresql.org/support/professional_support_northamerica

Personally, I would first look to a company that currently pays active PostgreSQL developers; Command Prompt and EnterpriseDB are two prominent vendors on that list. Looking at their websites (I have not used the services of either), Command Prompt has a number you can call for round-the-clock support whether you are a customer or not, and fairly clear pricing available up front.

-Dave