Thread: Very high effective_cache_size == worse performance?
Howdy all,

I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
64bit OS. No users currently.

I've got a J2EE app that loads data into the DB. It's got logic behind it so it's not a simple bulk load, so
I don't think we can use copy.

Based on the tuning guides, I set my effective_cache_size to 128GB (1/2 the available memory) on the box.

When I ran my load, it took approximately 15 hours to load 20 million records. I thought this was odd because
on a much smaller machine I was able to do that same amount of records in 6 hours.

My initial thought was hardware issues, so we got sar, vmstat, etc all running on the box and they didn't give
any indication that we had resource issues.

So I decided to just make the 2 PG config files look the same (the only change was dropping effective_cache_size
from 128GB to 2GB).

Now the large box performs the same as the smaller box (which is fine).

Incidentally, both tests were starting from a blank database.

Is this expected?

Thanks!

Dave
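P.S. For reference, the only postgresql.conf line that differed between the two runs:

    effective_cache_size = 128GB    # large box, per the tuning guides
    effective_cache_size = 2GB      # matching the small box; load speed recovered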
On Tue, 2010-04-20 at 10:39 -0700, David Kerr wrote:
> Howdy all,
>
> I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
> 64bit OS. No users currently.
>
> I've got a J2EE app that loads data into the DB. It's got logic behind it so it's not a simple bulk load, so
> I don't think we can use copy.
>
> Based on the tuning guides, I set my effective_cache_size to 128GB (1/2 the available memory) on the box.
>
> When I ran my load, it took approximately 15 hours to load 20 million records. I thought this was odd because
> on a much smaller machine I was able to do that same amount of records in 6 hours.
>
> My initial thought was hardware issues, so we got sar, vmstat, etc all running on the box and they didn't give
> any indication that we had resource issues.
>
> So I decided to just make the 2 PG config files look the same (the only change was dropping effective_cache_size
> from 128GB to 2GB).
>
> Now the large box performs the same as the smaller box (which is fine).
>
> Incidentally, both tests were starting from a blank database.
>
> Is this expected?

Without a more complete picture of the configuration, this post doesn't mean a whole lot. Further,
effective_cache_size is not likely to affect a bulk load at all.

Joshua D. Drake

> Thanks!
>
> Dave

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
On Tue, Apr 20, 2010 at 1:39 PM, David Kerr <dmk@mr-paradox.net> wrote:
> Howdy all,
>
> I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
> 64bit OS. No users currently.
>
> I've got a J2EE app that loads data into the DB. It's got logic behind it so it's not a simple bulk load, so
> I don't think we can use copy.
>
> Based on the tuning guides, I set my effective_cache_size to 128GB (1/2 the available memory) on the box.
>
> When I ran my load, it took approximately 15 hours to load 20 million records. I thought this was odd because
> on a much smaller machine I was able to do that same amount of records in 6 hours.
>
> My initial thought was hardware issues, so we got sar, vmstat, etc all running on the box and they didn't give
> any indication that we had resource issues.
>
> So I decided to just make the 2 PG config files look the same (the only change was dropping effective_cache_size
> from 128GB to 2GB).
>
> Now the large box performs the same as the smaller box (which is fine).
>
> Incidentally, both tests were starting from a blank database.
>
> Is this expected?

Lowering effective_cache_size tends to discourage the planner from using a
nested-loop-with-inner-indexscan plan - that's it.

What may be happening is that you may be loading data into some tables and then
running a query against those tables before the autovacuum daemon has a chance
to analyze them. I suspect that if you enable some logging you'll find that one
of those queries is really, really slow, and that (by happy coincidence)
discouraging it from using the index it thinks it should use happens to produce
a better plan. What you should probably do is, for each table that you bulk
load and then query, insert a manual ANALYZE between the two.

...Robert
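P.S. In JDBC terms, a minimal sketch of the manual ANALYZE (untested; the URL
and table name are placeholders):

import java.sql.*;

public class AnalyzeAfterLoad {
    public static void main(String[] args) throws SQLException {
        // Untested sketch: refresh planner stats on a just-loaded table
        // before the app starts querying it. Names below are made up.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost/mydb", "user", "pass");
             Statement st = conn.createStatement()) {
            st.execute("ANALYZE staging_table");
        }
    }
}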
On Tue, Apr 20, 2010 at 01:44:18PM -0400, Robert Haas wrote:
- On Tue, Apr 20, 2010 at 1:39 PM, David Kerr <dmk@mr-paradox.net> wrote:
- > My initial thought was hardware issues, so we got sar, vmstat, etc all running on the box and they didn't give
- > any indication that we had resource issues.
- >
- > So I decided to just make the 2 PG config files look the same (the only change was dropping effective_cache_size
- > from 128GB to 2GB).
- >
- > Now the large box performs the same as the smaller box (which is fine).
- >
- > Incidentally, both tests were starting from a blank database.
- >
- > Is this expected?
-
- Lowering effective_cache_size tends to discourage the planner from
- using a nested-loop-with-inner-indexscan plan - that's it.
-
- What may be happening is that you may be loading data into some tables
- and then running a query against those tables before the autovacuum
- daemon has a chance to analyze them. I suspect that if you enable
- some logging you'll find that one of those queries is really, really
- slow, and that (by happy coincidence) discouraging it from using the
- index it thinks it should use happens to produce a better plan. What
- you should probably do is, for each table that you bulk load and then
- query, insert a manual ANALYZE between the two.
-
- ...Robert

That thought occurred to me while I was testing this. I ran a vacuumdb -z on my
database during the load and it didn't impact performance at all.

Incidentally the code is written to work like this:

while (read X lines in file){
  Process those lines.
  write lines to DB.
}

So I would generally expect to get the benefits of the updated statistics once
the loop ended, no? (Would prepared statements affect that possibly?)

Also, while I was debugging the problem, I did load a 2nd file into the DB on
top of one that had been loaded. So the statistics almost certainly should have
been decent at that point.

I did turn on log_min_duration_statement but that caused performance to be
unbearable, but I could turn it on again if it would help.

Dave
On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
> That thought occurred to me while I was testing this. I ran a vacuumdb -z
> on my database during the load and it didn't impact performance at all.
>
> Incidentally the code is written to work like this:
>
> while (read X lines in file){
>   Process those lines.
>   write lines to DB.
> }
>
> So I would generally expect to get the benefits of the updated statistics
> once the loop ended, no? (Would prepared statements affect that possibly?)
>
> Also, while I was debugging the problem, I did load a 2nd file into the DB
> on top of one that had been loaded. So the statistics almost certainly should
> have been decent at that point.
>
> I did turn on log_min_duration_statement but that caused performance to be
> unbearable, but I could turn it on again if it would help.
>
> Dave
You can absolutely use copy if you like but you need to use a non-standard jdbc driver: kato.iki.fi/sw/db/postgresql/jdbc/copy/. I've used it in the past and it worked.
Is the whole thing going in in one transaction? I'm reasonably sure statistics aren't kept for uncommitted transactions.
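Something like this is what I'm getting at (untested sketch; the helper is
hypothetical):

import java.sql.*;
import java.util.List;

public class BatchedLoad {
    // Commit per batch so the new rows are visible to ANALYZE/autovacuum
    // while the load is still running, not only at the very end.
    static void load(Connection conn, List<List<String>> batches) throws SQLException {
        conn.setAutoCommit(false);
        for (List<String> batch : batches) {
            writeBatch(conn, batch); // hypothetical helper doing the INSERTs
            conn.commit();           // stats for these rows can now be gathered
        }
    }

    static void writeBatch(Connection conn, List<String> rows) throws SQLException {
        // application-specific insert logic would go here
    }
}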
For inserts the prepared statements can only help. For selects they can hurt because eventually the JDBC driver will turn them into back end prepared statements that are only planned once. The price here is that that plan may not be the best plan for the data that you throw at it.
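If the server-side prepare turns out to be the problem, the driver's
prepareThreshold parameter is the knob to experiment with -- if I remember
right, it defaults to 5 and setting it to 0 keeps statements from being
promoted at all (untested sketch; connection details are placeholders):

import java.sql.*;

public class NoServerPrepare {
    public static void main(String[] args) throws SQLException {
        // prepareThreshold controls after how many executions pgjdbc switches
        // a PreparedStatement over to a server-side prepared statement.
        String url = "jdbc:postgresql://dbhost/mydb?prepareThreshold=0";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             PreparedStatement ps = conn.prepareStatement("SELECT 1")) {
            ps.executeQuery().close();
        }
    }
}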
What was log_min_duration_statement logging that it killed performance?
--Nik
On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
> That thought occurred to me while I was testing this. I ran a vacuumdb -z
> on my database during the load and it didn't impact performance at all.

The window to run ANALYZE usefully is pretty short. If you run it before the
load is complete, your stats will be wrong. If you run it after the select
statements that hit the table are planned, the updated stats won't arrive in
time to do any good.

> I did turn on log_min_duration_statement but that caused performance to be
> unbearable, but I could turn it on again if it would help.

I think you need to find a way to identify exactly which query is running
slowly. You could sit there and run "select * from pg_stat_activity", or turn
on log_min_duration_statement, or have your application print out timestamps
at key points, or some other method...

...Robert
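P.S. A trivial watcher along these lines would do it (untested sketch; 8.3-era
pg_stat_activity column names, connection details made up):

import java.sql.*;

public class WatchActivity {
    public static void main(String[] args) throws SQLException {
        // Poll pg_stat_activity from a second session to catch the slow statement.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost/mydb", "user", "pass");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT procpid, now() - query_start AS runtime, current_query " +
                 "FROM pg_stat_activity WHERE current_query <> '<IDLE>'")) {
            while (rs.next()) {
                System.out.println(rs.getString("runtime") + "  "
                                   + rs.getString("current_query"));
            }
        }
    }
}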
On Tue, Apr 20, 2010 at 11:39 AM, David Kerr <dmk@mr-paradox.net> wrote:
> Howdy all,
>
> I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
> 64bit OS. No users currently.

What's your IO subsystem look like? What did vmstat actually say?
David Kerr <dmk@mr-paradox.net> wrote:
> Incidentally the code is written to work like this:
>
> while (read X lines in file){
>   Process those lines.
>   write lines to DB.
> }

Unless you're selecting from multiple database tables in one query,
effective_cache_size shouldn't make any difference. There's probably some
other reason for the difference.

A couple wild shots in the dark:

Any chance the source files were cached the second time, but not the first?

Do you have a large checkpoint_segments setting, and did the second run go
without a new initdb?

-Kevin
On Tue, 20 Apr 2010, Nikolas Everett wrote:
> You can absolutely use copy if you like but you need to use a non-standard
> jdbc driver: kato.iki.fi/sw/db/postgresql/jdbc/copy/. I've used it in the
> past and it worked.

Copy support has been added to the 8.4 driver.

Kris Jurka
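P.S. With the 8.4 driver it looks roughly like this (untested sketch; table
and file names are made up):

import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class CopyLoad {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://dbhost/mydb", "user", "pass");
        // getCopyAPI() is the 8.4 driver's entry point for COPY support.
        CopyManager cm = ((PGConnection) conn).getCopyAPI();
        long rows = cm.copyIn("COPY my_table FROM STDIN WITH CSV",
                              new FileReader("/tmp/load.csv"));
        System.out.println("loaded " + rows + " rows");
        conn.close();
    }
}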
On Tue, Apr 20, 2010 at 12:15 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Tue, Apr 20, 2010 at 11:39 AM, David Kerr <dmk@mr-paradox.net> wrote:
>> Howdy all,
>>
>> I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
>> 64bit OS. No users currently.
>
> What's your IO subsystem look like? What did vmstat actually say?

Note that on a 48 core machine, if vmstat shows 2% wait and 98% idle you could
still be 100% IO bound, because those numbers are percentages of total CPU
(2% of 48 cores is roughly one core stuck in IO wait). iostat -x 10 will give
a better view of how hard your disks are working, and whether they're the issue.
On Tue, Apr 20, 2010 at 02:12:15PM -0400, Nikolas Everett wrote:
- On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
-
- > That thought occurred to me while I was testing this. I ran a vacuumdb -z
- > on my database during the load and it didn't impact performance at all.
- >
- > Incidentally the code is written to work like this:
- >
- > while (read X lines in file){
- >   Process those lines.
- >   write lines to DB.
- > }
- >
- > So I would generally expect to get the benefits of the updated statistics
- > once the loop ended, no? (Would prepared statements affect that possibly?)
- >
- > Also, while I was debugging the problem, I did load a 2nd file into the DB
- > on top of one that had been loaded. So the statistics almost certainly
- > should have been decent at that point.
- >
- > I did turn on log_min_duration_statement but that caused performance to be
- > unbearable, but I could turn it on again if it would help.
- >
- > Dave
-
- You can absolutely use copy if you like but you need to use a non-standard
- jdbc driver: kato.iki.fi/sw/db/postgresql/jdbc/copy/. I've used it in the
- past and it worked.
-
- Is the whole thing going in in one transaction? I'm reasonably sure
- statistics aren't kept for uncommitted transactions.
-
- For inserts the prepared statements can only help. For selects they can
- hurt because eventually the JDBC driver will turn them into back end
- prepared statements that are only planned once. The price here is that that
- plan may not be the best plan for the data that you throw at it.
-
- What was log_min_duration_statement logging that it killed performance?
-
- --Nik

Good to know about the jdbc-copy, but this is a huge project and the load is
just one very very tiny component. I don't think we could introduce anything
new to assist that.

It's not all in one tx. I don't have visibility into the code to determine how
it's broken down, but most likely each while loop is a tx.

I set it to log all statements (i.e., = 0). That doubled the load time from
~15 to ~30 hours. I could, of course, be more granular if it would be helpful.

Dave
On Tue, Apr 20, 2010 at 12:20 PM, David Kerr <dmk@mr-paradox.net> wrote:
> On Tue, Apr 20, 2010 at 02:12:15PM -0400, Nikolas Everett wrote:
> - On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
> -
> - > That thought occurred to me while I was testing this. I ran a vacuumdb -z
> - > on my database during the load and it didn't impact performance at all.
> - >
> - > Incidentally the code is written to work like this:
> - >
> - > while (read X lines in file){
> - >   Process those lines.
> - >   write lines to DB.
> - > }
> - >
> - > So I would generally expect to get the benefits of the updated statistics
> - > once the loop ended, no? (Would prepared statements affect that possibly?)
> - >
> - > Also, while I was debugging the problem, I did load a 2nd file into the DB
> - > on top of one that had been loaded. So the statistics almost certainly
> - > should have been decent at that point.
> - >
> - > I did turn on log_min_duration_statement but that caused performance to be
> - > unbearable, but I could turn it on again if it would help.
> - >
> - > Dave
> -
> - You can absolutely use copy if you like but you need to use a non-standard
> - jdbc driver: kato.iki.fi/sw/db/postgresql/jdbc/copy/. I've used it in the
> - past and it worked.
> -
> - Is the whole thing going in in one transaction? I'm reasonably sure
> - statistics aren't kept for uncommitted transactions.
> -
> - For inserts the prepared statements can only help. For selects they can
> - hurt because eventually the JDBC driver will turn them into back end
> - prepared statements that are only planned once. The price here is that that
> - plan may not be the best plan for the data that you throw at it.
> -
> - What was log_min_duration_statement logging that it killed performance?
> -
> - --Nik
>
> Good to know about the jdbc-copy, but this is a huge project and the load is
> just one very very tiny component. I don't think we could introduce anything
> new to assist that.
>
> It's not all in one tx. I don't have visibility into the code to determine how
> it's broken down, but most likely each while loop is a tx.
>
> I set it to log all statements (i.e., = 0). That doubled the load time from
> ~15 to ~30 hours. I could, of course, be more granular if it would be helpful.

So are you logging to the same drive that has pg_xlog and your data/base
directory on this machine?
On Tue, Apr 20, 2010 at 12:23:51PM -0600, Scott Marlowe wrote:
- On Tue, Apr 20, 2010 at 12:20 PM, David Kerr <dmk@mr-paradox.net> wrote:
- > On Tue, Apr 20, 2010 at 02:12:15PM -0400, Nikolas Everett wrote:
- > - On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
- > -
- > - You can absolutely use copy if you like but you need to use a non-standard
- > - jdbc driver: kato.iki.fi/sw/db/postgresql/jdbc/copy/. I've used it in the
- > - past and it worked.
- > -
- > - Is the whole thing going in in one transaction? I'm reasonably sure
- > - statistics aren't kept for uncommitted transactions.
- > -
- > - For inserts the prepared statements can only help. For selects they can
- > - hurt because eventually the JDBC driver will turn them into back end
- > - prepared statements that are only planned once. The price here is that that
- > - plan may not be the best plan for the data that you throw at it.
- > -
- > - What was log_min_duration_statement logging that it killed performance?
- > -
- > - --Nik
- >
- > Good to know about the jdbc-copy, but this is a huge project and the load is
- > just one very very tiny component. I don't think we could introduce anything
- > new to assist that.
- >
- > It's not all in one tx. I don't have visibility into the code to determine how
- > it's broken down, but most likely each while loop is a tx.
- >
- > I set it to log all statements (i.e., = 0). That doubled the load time from
- > ~15 to ~30 hours. I could, of course, be more granular if it would be helpful.
-
- So are you logging to the same drive that has pg_xlog and your
- data/base directory on this machine?

The db, xlog and logs are all on separate areas of the SAN, with separate I/O
controllers, etc on the SAN. It's set up well; I wouldn't expect contention there.

I'm logging via syslog. I've had trouble with that before; when I moved to
syslog-ng on my dev environments that mostly resolved the problem for me, but
these machines still have vanilla syslog.

Dave
On Tue, Apr 20, 2010 at 12:28 PM, David Kerr <dmk@mr-paradox.net> wrote:
>
> I'm logging via syslog. I've had trouble with that before; when I moved to
> syslog-ng on my dev environments that mostly resolved the problem for me, but
> these machines still have vanilla syslog.

Yea, I almost always log directly via stdout on production machines because of that.
On Tue, Apr 20, 2010 at 12:30:14PM -0600, Scott Marlowe wrote:
- On Tue, Apr 20, 2010 at 12:28 PM, David Kerr <dmk@mr-paradox.net> wrote:
- >
- > I'm logging via syslog. I've had trouble with that before; when I moved to
- > syslog-ng on my dev environments that mostly resolved the problem for me, but
- > these machines still have vanilla syslog.
-
- Yea, I almost always log directly via stdout on production machines
- because of that.

Ah well, good to know I'm not the only one =)

I'll get the query info. I've got a twin system that I can use and abuse.

Dave
On Tue, Apr 20, 2010 at 01:17:02PM -0500, Kevin Grittner wrote:
- David Kerr <dmk@mr-paradox.net> wrote:
-
- > Incidentally the code is written to work like this:
- >
- > while (read X lines in file){
- >   Process those lines.
- >   write lines to DB.
- > }
-
- Unless you're selecting from multiple database tables in one query,
- effective_cache_size shouldn't make any difference. There's
- probably some other reason for the difference.
-
- A couple wild shots in the dark:
-
- Any chance the source files were cached the second time, but not the
- first?
-
- Do you have a large checkpoint_segments setting, and did the second
- run go without a new initdb?
-
- -Kevin

No, I don't think the files would be cached the 2nd time. I ran it multiple
times and got the same performance each time. It wasn't until I changed the
parameter that performance got better.

I've got checkpoint_segments = 300.

Dave
On Tue, Apr 20, 2010 at 02:15:19PM -0400, Robert Haas wrote:
- On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
- > That thought occurred to me while I was testing this. I ran a vacuumdb -z
- > on my database during the load and it didn't impact performance at all.
-
- The window to run ANALYZE usefully is pretty short. If you run it
- before the load is complete, your stats will be wrong. If you run it
- after the select statements that hit the table are planned, the
- updated stats won't arrive in time to do any good.

Right, but I'm loading 20 million records in 1000 record increments, so the
analyze should affect all subsequent increments, no?

- > I did turn on log_min_duration_statement but that caused performance to be
- > unbearable, but I could turn it on again if it would help.
-
- I think you need to find a way to identify exactly which query is
- running slowly. You could sit there and run "select * from
- pg_stat_activity", or turn on log_min_duration_statement, or have your
- application print out timestamps at key points, or some other
- method...

I'm on it.

Dave
On Tue, Apr 20, 2010 at 12:47 PM, David Kerr <dmk@mr-paradox.net> wrote:
> On Tue, Apr 20, 2010 at 02:15:19PM -0400, Robert Haas wrote:
> - On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
> - > That thought occurred to me while I was testing this. I ran a vacuumdb -z
> - > on my database during the load and it didn't impact performance at all.
> -
> - The window to run ANALYZE usefully is pretty short. If you run it
> - before the load is complete, your stats will be wrong. If you run it
> - after the select statements that hit the table are planned, the
> - updated stats won't arrive in time to do any good.
>
> Right, but I'm loading 20 million records in 1000 record increments, so the
> analyze should affect all subsequent increments, no?

I keep thinking the FK checks are taking a long time: the referenced rows
aren't actually cached, because during an import they go through the ring
buffer in pg (or otherwise never land in a buffer), but a large
effective_cache_size tells the planner there's a 99.99% or better chance
they're in cache, so it chooses a poor plan to look them up. Just a guess.
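One way to check that guess would be to compare the plan for the kind of
single-row lookup an FK check does under both settings (untested sketch;
table/column names made up):

import java.sql.*;

public class ComparePlans {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost/mydb", "user", "pass");
             Statement st = conn.createStatement()) {
            for (String size : new String[] {"128GB", "2GB"}) {
                // effective_cache_size can be overridden per session.
                st.execute("SET effective_cache_size = '" + size + "'");
                try (ResultSet rs = st.executeQuery(
                         "EXPLAIN SELECT 1 FROM parent_table WHERE id = 42")) {
                    System.out.println("-- effective_cache_size = " + size);
                    while (rs.next()) System.out.println(rs.getString(1));
                }
            }
        }
    }
}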
On Tue, Apr 20, 2010 at 12:28 PM, David Kerr <dmk@mr-paradox.net> wrote:
> On Tue, Apr 20, 2010 at 12:23:51PM -0600, Scott Marlowe wrote:
> - So are you logging to the same drive that has pg_xlog and your
> - data/base directory on this machine?
>
> The db, xlog and logs are all on separate areas of the SAN, with separate I/O
> controllers, etc on the SAN. It's set up well; I wouldn't expect contention there.

Same xkb/s gigabit connection?
David Kerr wrote:
> The db, xlog and logs are all on separate areas of the SAN, with separate I/O
> controllers, etc on the SAN. It's set up well; I wouldn't expect contention there.

Just because you don't expect it doesn't mean it's not there. Particularly with
something as complicated as a SAN setup, presuming anything without actually
benchmarking it is a recipe for fuzzy diagnostics when problems pop up. If you
took anyone's word that your SAN has good performance without confirming it
yourself, that's a path that's led many to trouble.

Anyway, as Robert already stated, effective_cache_size only impacts how some
very specific types of queries are executed; that's it. If there's some sort of
query behavior involved in your load, maybe that has something to do with your
slowdown, but it doesn't explain general slow performance. Other possibilities
include that something else changed when you reloaded the server as part of
that, or it's a complete coincidence--perhaps autoanalyze happened to finish at
around the same time and it led to new plans.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
On Tue, Apr 20, 2010 at 04:26:52PM -0400, Greg Smith wrote:
- David Kerr wrote:
- > The db, xlog and logs are all on separate areas of the SAN, with separate I/O
- > controllers, etc on the SAN. It's set up well; I wouldn't expect contention there.
-
- Just because you don't expect it doesn't mean it's not there. Particularly with
- something as complicated as a SAN setup, presuming anything without actually
- benchmarking it is a recipe for fuzzy diagnostics when problems pop up. If you
- took anyone's word that your SAN has good performance without confirming it
- yourself, that's a path that's led many to trouble.

That's actually what I'm doing: performance testing this environment.
Everything's on the table for me at this point.

- Anyway, as Robert already stated, effective_cache_size only impacts how some
- very specific types of queries are executed; that's it. If there's some sort of
- query behavior involved in your load, maybe that has something to do with your
- slowdown, but it doesn't explain general slow performance. Other possibilities
- include that something else changed when you reloaded the server as part of
- that, or it's a complete coincidence--perhaps autoanalyze happened to finish at
- around the same time and it led to new plans.

Ok, that's good to know. I didn't think it would have any impact, and was
surprised when it appeared to.

I just finished running the test on another machine and wasn't able to
reproduce the problem, so that's good news in some ways. But now I'm back to
the drawing board.

I don't think it's anything in the DB that's causing it (I drop and re-create
the db between tests). I actually suspect a hardware issue somewhere.

Dave
David Kerr wrote:
> I don't think it's anything in the DB that's causing it (I drop and re-create
> the db between tests). I actually suspect a hardware issue somewhere.

You might find my "Database Hardware Benchmarking" talk, available at
http://projects.2ndquadrant.com/talks , useful to help sort out what's good
and bad on each server, and correspondingly what's different between the two.
Many of the ideas there came from fighting with SAN hardware that didn't do
what I expected.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
On Apr 20, 2010, at 12:22 PM, Scott Marlowe wrote:
> On Tue, Apr 20, 2010 at 12:47 PM, David Kerr <dmk@mr-paradox.net> wrote:
>> On Tue, Apr 20, 2010 at 02:15:19PM -0400, Robert Haas wrote:
>> - On Tue, Apr 20, 2010 at 2:03 PM, David Kerr <dmk@mr-paradox.net> wrote:
>> - > That thought occurred to me while I was testing this. I ran a vacuumdb -z
>> - > on my database during the load and it didn't impact performance at all.
>> -
>> - The window to run ANALYZE usefully is pretty short. If you run it
>> - before the load is complete, your stats will be wrong. If you run it
>> - after the select statements that hit the table are planned, the
>> - updated stats won't arrive in time to do any good.
>>
>> Right, but I'm loading 20 million records in 1000 record increments, so the
>> analyze should affect all subsequent increments, no?
>
> I keep thinking the FK checks are taking a long time: the referenced rows
> aren't actually cached, because during an import they go through the ring
> buffer in pg (or otherwise never land in a buffer), but a large
> effective_cache_size tells the planner there's a 99.99% or better chance
> they're in cache, so it chooses a poor plan to look them up. Just a guess.

Yeah, I was thinking the same thing. If possible, make sure the table either
has no indexes and FKs, or only the minimum required (PK?), while doing the
load, then add the indexes and FKs later. Whether this is possible depends on
what the schema is and what must be known by the app to load the data, but if
you can do it it's a huge win.

Of course, if it's not all in one transaction and there is any other
concurrency going on, that could be a bad idea. Or, if this is not a load on a
fresh table but an append/update, it may not be possible to drop some of the
indexes first.

Generally speaking, a load on a table without an index followed by index
creation is at least twice as fast, and often 5x as fast or more. This is less
true if each row is an individual insert and batching or
"insert into foo values (a, b, c, ...), (a2, b2, c2, ...)" multiple row syntax
is not used.
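For reference, the two batching styles in JDBC terms (untested sketch; table
and column names are made up):

import java.sql.*;

public class BatchStyles {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost/mydb", "user", "pass")) {
            conn.setAutoCommit(false);

            // Multi-row VALUES: many rows per statement, one round trip.
            try (Statement st = conn.createStatement()) {
                st.executeUpdate("INSERT INTO foo (a, b, c) VALUES "
                               + "(1, 'x', 10), (2, 'y', 20), (3, 'z', 30)");
            }

            // JDBC batching: one prepared statement, rows queued client-side.
            try (PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO foo (a, b, c) VALUES (?, ?, ?)")) {
                for (int i = 0; i < 1000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "v" + i);
                    ps.setInt(3, i * 10);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            conn.commit();
        }
    }
}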