Thread: Slow query performance on large table
Hi,
I am executing a query on a table:
Table "public.measurement"
Column | Type | Modifiers
------------+-----------------------+-----------
assessment | integer |
time | integer |
value | character varying(50) |
Indexes: idx_measurement_assessment btree (assessment),
idx_measurement_time btree ("time")
The primary key of the table is a combination of assessment and time, and there are indexes on both assessment and time.
The query I am executing is
Select time,value
From measurement
Where assessment = ?
And time between ? and ?
This used to run like a rocket before my database got a little larger. There are now around 15 million rows in the table, and it is taking a long time to execute queries that get a fair number of rows back (c. 300).
The database is ‘VACUUM ANALYZED’ regularly, and I’ve upped the shared buffers to a significant amount.
I've tried it on various machine configurations now: a dual-processor Linux/Intel machine with 1 GB of memory (0.5 GB shared buffers), a single-processor Linux/Intel machine (0.25 GB shared buffers), and a Solaris machine (0.25 GB shared buffers). I'm getting similar performance on all of them.
Anybody see anything I’ve obviously done wrong? Any ways of improving the performance of this query?
Thanks in advance.
Paul McKay.
======================================
Paul Mckay
Consultant Partner
Servicing Division
Clearwater-IT
e:paul_mckay@clearwater-it.co.uk
t:0161 877 6090
m: 07713 510946
======================================
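For concreteness on the buffer figures above: in 7.x-era PostgreSQL, shared_buffers is counted in 8 kB pages, so the 0.5 GB setting would presumably look something like the postgresql.conf sketch below (an illustrative value, not a tuning recommendation).

    # postgresql.conf -- 7.x counts shared_buffers in 8 kB pages
    shared_buffers = 65536    # 65536 x 8 kB = 512 MB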
Paul McKay wrote:
> The query I am executing is
>
> Select time,value
> From measurement
> Where assessment = ?
> And time between ? and ?

Replacing the two single-column indexes with one two-column index should improve performance considerably:

create index ind_meas on measurement (assessment, time);

Regards,
Tomasz Myrta
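Spelling that suggestion out as a complete statement (the quoting of "time" is an assumption here, mirroring the \d output above, where the column name is shown quoted):

    CREATE INDEX ind_meas ON measurement (assessment, "time");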
"Paul McKay" <paul_mckay@clearwater-it.co.uk> writes: > The query I am executing is > Select time,value > From measurement > Where assessment = ? > And time between ? and ? EXPLAIN ANALYZE would help you investigate this. Is it using an indexscan? On which index? Does forcing use of the other index (by temporarily dropping the preferred one) improve matters? Possibly a two-column index on both assessment and time would be an improvement, but it's hard to guess without knowing anything about the selectivity of the two WHERE clauses. regards, tom lane
On Tue, Mar 04, 2003 at 02:45:18PM -0000, Paul McKay wrote:
> Select time,value
> From measurement
> Where assessment = ?
> And time between ? and ?

Please run this with EXPLAIN ANALYSE, using values that slow the query down. My bet is that you have an index which needs a wider statistics setting on the column to be useful, but without the output from EXPLAIN ANALYSE it'll be hard to tell.

A

--
Andrew Sullivan                          204-4141 Yonge Street
Liberty RMS                            Toronto, Ontario Canada
<andrew@libertyrms.info>                               M2P 2A8
                                          +1 416 646 3304 x110
The results were

clearview=# explain analyse
clearview-# select assessment,time
clearview-# from measurement
clearview-# where assessment = 53661
clearview-# and time between 1046184261 and 1046335461;
NOTICE:  QUERY PLAN:

Index Scan using idx_measurement_assessment on measurement
  (cost=0.00..34668.61 rows=261 width=8)
  (actual time=26128.07..220584.69 rows=503 loops=1)
Total runtime: 220587.06 msec

EXPLAIN

After adding the index kindly suggested by yourself and Tomasz, I get

clearview=# explain analyse
clearview-# select assessment,time
clearview-# from measurement
clearview-# where assessment = 53661
clearview-# and time between 1046184261 and 1046335461;
NOTICE:  QUERY PLAN:

Index Scan using ind_measurement_ass_time on measurement
  (cost=0.00..1026.92 rows=261 width=8)
  (actual time=15.37..350.46 rows=503 loops=1)
Total runtime: 350.82 msec

EXPLAIN

I vaguely recall doing a bit of a reorganisation on this database a while back, and it looks like I lost the primary key index. No wonder it was going slow.

Thanks a lot for your help.

Paul McKay.
Tom Lane wrote:
> EXPLAIN ANALYZE would help you investigate this. Is it using an
> indexscan? On which index? Does forcing use of the other index
> (by temporarily dropping the preferred one) improve matters?
>
> Possibly a two-column index on both assessment and time would be
> an improvement, but it's hard to guess without knowing anything
> about the selectivity of the two WHERE clauses.

Tom,

does this mean that a primary key alone might not be enough? As far as I understood Paul, the PK looks much the same as the newly created index does, so "create index ind_meas on measurement (assessment,time)" should perform the same as "... primary key(assessment,time)".

Do possibly non-optimal indices (only assessment, only time, as Paul described earlier) confuse the optimizer, making it ignore the better option of using the PK? Obviously, the index used should be a combination of (assessment,time), but IMHO a PK should be enough.

regards,

Andreas
On Tue, 2003-03-04 at 11:11, Paul McKay wrote:
> Index Scan using idx_measurement_assessment on measurement
>   (cost=0.00..34668.61 rows=261 width=8)
>   (actual time=26128.07..220584.69 rows=503 loops=1)
> Total runtime: 220587.06 msec
>
> [after adding the two-column index]
>
> Index Scan using ind_measurement_ass_time on measurement
>   (cost=0.00..1026.92 rows=261 width=8)
>   (actual time=15.37..350.46 rows=503 loops=1)
> Total runtime: 350.82 msec
>
> I vaguely recall doing a bit of a reorganisation on this database a
> while back, and it looks like I lost the primary key index. No wonder
> it was going slow.

Maybe it's just me, but I get the feeling you need to work some regular reindexing into your maintenance schedule. Given your query is using BETWEEN, I don't think it would use the index on the time field anyway (and EXPLAIN ANALYZE seems to support this). Rewrite it so that you have "and time > foo and time < bar" and I think you'll see a difference. With that in mind, I think your speedier query results are due more to having a freshly created, non-bloated index than to its being a dual-column index.

Robert Treat
Paul,

> Index Scan using idx_measurement_assessment on measurement
>   (cost=0.00..34668.61 rows=261 width=8)
>   (actual time=26128.07..220584.69 rows=503 loops=1)
> Total runtime: 220587.06 msec

These query results say to me that you need to do both a VACUUM FULL and a REINDEX on this table. The 26-second delay before returning the first row says "table/index with lots of dead pages" to me.

For the future, you should consider dramatically increasing your FSM settings and working a regular VACUUM FULL and REINDEX into your maintenance jobs.

--
Josh Berkus
Aglio Database Solutions
San Francisco
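A sketch of that maintenance routine, assuming a 7.2/7.3-era server; the FSM values are illustrative placeholders to be sized against the table's actual update churn:

    -- reclaim dead space and refresh planner statistics
    VACUUM FULL ANALYZE measurement;
    -- rebuild the (possibly bloated) indexes
    REINDEX TABLE measurement;

    # postgresql.conf -- FSM settings (restart required in 7.x)
    max_fsm_relations = 1000
    max_fsm_pages = 200000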
I used the "between ... and" in a vain attempt to improve performance! Running with < and > improves the performance again, by about 10 times. The explains I sent earlier were run on a test server (not the live server) where I had recreated the database in order to investigate matters, so all the indexes were newly created anyway. The dual-column index was the key (literally).

Paul McKay.
Robert Treat <xzilla@users.sourceforge.net> writes:
> Maybe it's just me, but I get the feeling you need to work some regular
> reindexing into your maintenance schedule.

Or at least, more vacuuming...

> Given your query is using BETWEEN, I don't think it would use the index
> on the time field anyway (and EXPLAIN ANALYZE seems to support this).
> Rewrite it so that you have "and time > foo and time < bar" and I think
> you'll see a difference.

No, you won't, because that's exactly what BETWEEN is.

			regards, tom lane
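Spelled out: the parser rewrites BETWEEN into a pair of ordinary comparisons before the planner ever sees it, so the two predicates below plan identically. The one wrinkle is that BETWEEN is inclusive, so the strict > / < rewrite suggested above is not quite the same predicate at the boundary values.

    WHERE time BETWEEN 1046184261 AND 1046335461
    -- is parsed as
    WHERE time >= 1046184261 AND time <= 1046335461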
Andreas Pflug wrote:
> does this mean that a primary key alone might not be enough? As far as I
> understood Paul, the PK looks much the same as the newly created index
> does, so "create index ind_meas on measurement (assessment,time)" should
> perform the same as "... primary key(assessment,time)".

You are right - a primary key would have been fine, but Paul had lost his. psql's \d shows primary key indexes, and in this case there was no such primary key.

Regards,
Tomasz Myrta
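A sketch of how the lost key could be restored on a 7.3-era server (PostgreSQL requires the key columns to be NOT NULL first; on a 15-million-row table each step locks and scans the table, so this belongs in a maintenance window):

    ALTER TABLE measurement ALTER COLUMN assessment SET NOT NULL;
    ALTER TABLE measurement ALTER COLUMN "time" SET NOT NULL;
    ALTER TABLE measurement ADD PRIMARY KEY (assessment, "time");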
Robert Treat wrote:
> Given your query is using BETWEEN, I don't think it would use the index
> on the time field anyway (and EXPLAIN ANALYZE seems to support this).
> Rewrite it so that you have "and time > foo and time < bar" and I think
> you'll see a difference.

Do you know something about BETWEEN that the rest of us should know? I made some tests, and there was no noticeable difference between them:

pvwatch=# EXPLAIN analyze SELECT * from stats where hostid=1 and stp between 1 and 2;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using ind_stats on stats  (cost=0.00..6.01 rows=1 width=28) (actual time=0.00..0.00 rows=0 loops=1)
   Index Cond: ((hostid = 1) AND (stp >= 1) AND (stp <= 2))
 Total runtime: 0.00 msec
(3 rows)

pvwatch=# EXPLAIN analyze SELECT * from stats where hostid=1 and stp > 1 and stp < 2;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using ind_stats on stats  (cost=0.00..6.01 rows=1 width=28) (actual time=0.00..0.00 rows=0 loops=1)
   Index Cond: ((hostid = 1) AND (stp > 1) AND (stp < 2))
 Total runtime: 0.00 msec
(3 rows)

Regards,
Tomasz Myrta
Andreas Pflug <Andreas.Pflug@web.de> writes:
> "create index ind_meas on measurement (assessment,time)" should perform
> the same as "... primary key(assessment,time)".

Sure.

> Do possibly non-optimal indices (only assessment, only time, as Paul
> described earlier) confuse the optimizer, making it ignore the better
> option of using the PK?

One would like to think the optimizer will make the right choice. But using a two-column index just because it's there isn't necessarily the right choice. The two-column index will certainly be bulkier and more expensive to scan, so if there's a one-column index that's nearly as selective, it might be a better choice.

			regards, tom lane
Tomasz Myrta wrote:
> You are right - a primary key would have been fine, but Paul had lost
> his. psql's \d shows primary key indexes, and in this case there was no
> such primary key.

Ok, then my view of the world is all right again.

Re Tom Lane:
> One would like to think the optimizer will make the right choice. But
> using a two-column index just because it's there isn't necessarily the
> right choice. The two-column index will certainly be bulkier and more
> expensive to scan, so if there's a one-column index that's nearly as
> selective, it might be a better choice.

If I know that the access pattern of my app looks as if it will need a multipart index, I should create it. If the optimizer finds that a simpler one fits better, all right, it knows best (if properly VACUUMed :-). But it's still good practice to offer complete indices.

Will pgsql use a multipart index as efficiently for simpler queries as a shorter one covering only the first columns? In this example, the (assessment, time) index could replace the (assessment) index, but certainly not the (time) index. I tend to design longer indices with hopefully valuable columns.

In this context: from MSSQL, I know "covering indices". Imagine a table t with many columns, and an index on (a,b,c). In MSSQL, SELECT c FROM t WHERE (a ... AND b ...) will use that index to retrieve the c column value as well, without touching the row data. In a sense, the index is used as an alternative table. Does pgsql profit from this kind of indices also?

Regards,

Andreas
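A quick sketch of the prefix behaviour in question, reusing the index from earlier in the thread: a btree on (assessment, time) can serve a query that constrains the leading column alone, but not one that constrains only the trailing column. As for covering indices, PostgreSQL of this era always has to visit the heap to check tuple visibility, so an index cannot stand in for the table the way an MSSQL covering index does.

    -- can use ind_meas: the leading column is constrained
    EXPLAIN SELECT "time", value FROM measurement WHERE assessment = 53661;

    -- cannot use ind_meas effectively: only the trailing column is constrained
    EXPLAIN SELECT "time", value FROM measurement WHERE "time" > 1046184261;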
Hopefully you guys can help me with another query I've got that's running slow. This time it's across two tables I have.

clearview=# \d panconversation
      Table "panconversation"
   Column    |  Type   | Modifiers
-------------+---------+-----------
 assessment  | integer | not null
 interface   | integer |
 source      | integer |
 destination | integer |
 protocol    | integer |
Indexes: idx_panconversation_destination,
         idx_panconversation_interface,
         idx_panconversation_protocol,
         idx_panconversation_source
Primary key: panconversation_pkey
Unique keys: unq_panconversation
Triggers: RI_ConstraintTrigger_52186648,
          RI_ConstraintTrigger_52186654,
          RI_ConstraintTrigger_52186660,
          RI_ConstraintTrigger_52186666

Primary key is assessment.

Along with the table I was dealing with before, with the index I'd mislaid put back in:

clearview=# \d measurement
        Table "measurement"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 assessment | integer               |
 time       | integer               |
 value      | character varying(50) |
Indexes: idx_measurement_assessment,
         idx_measurement_time,
         ind_measurement_ass_time

The 'explain analyse' of the query I am running is rather evil.

clearview=# explain analyse select source,value
clearview-# from measurement, PANConversation
clearview-# where PANConversation.assessment = measurement.assessment
clearview-# and Interface = 11
clearview-# and Time > 1046184261 and Time < 1046335461
clearview-# ;
NOTICE:  QUERY PLAN:

Hash Join  (cost=1532.83..345460.73 rows=75115 width=23)
           (actual time=1769.84..66687.11 rows=16094 loops=1)
  ->  Seq Scan on measurement  (cost=0.00..336706.07 rows=418859 width=15)
                               (actual time=1280.11..59985.47 rows=455788 loops=1)
  ->  Hash  (cost=1498.21..1498.21 rows=13848 width=8)
            (actual time=253.49..253.49 rows=0 loops=1)
        ->  Seq Scan on panconversation  (cost=0.00..1498.21 rows=13848 width=8)
                                         (actual time=15.64..223.18 rows=13475 loops=1)
Total runtime: 66694.82 msec

EXPLAIN

Can anybody shed any light on why the indexes I created aren't being used, and why I have these nasty sequential scans?

Thanks in advance,

Paul.
Paul McKay wrote:
> Hash Join  (cost=1532.83..345460.73 rows=75115 width=23)
>            (actual time=1769.84..66687.11 rows=16094 loops=1)
>   ->  Seq Scan on measurement  (cost=0.00..336706.07 rows=418859 width=15)
>                                (actual time=1280.11..59985.47 rows=455788 loops=1)
>   ->  Hash  (cost=1498.21..1498.21 rows=13848 width=8)
>             (actual time=253.49..253.49 rows=0 loops=1)
>         ->  Seq Scan on panconversation  (cost=0.00..1498.21 rows=13848 width=8)
>                                          (actual time=15.64..223.18 rows=13475 loops=1)
> Total runtime: 66694.82 msec
>
> Can anybody shed any light on why the indexes I created aren't being
> used, and why I have these nasty sequential scans?

Measurement is sequentially scanned, probably because your conditions match a lot of records. Please check how many rows you have:
- all rows in measurement and in panconversation,
- rows in panconversation with interface = 11,
- rows in measurement between your two time values.

Regards,
Tomasz Myrta
clearview=# select count(*) from measurement;
  count
----------
 15302138
(1 row)

clearview=# select count(*) from panconversation;
 count
-------
 77217
(1 row)

clearview=# select count(*) from panconversation where interface = 11;
 count
-------
 13475
(1 row)

clearview=# select count(*) from measurement where time > 1046184261 and time < 1046335461;
 count
--------
 455788
(1 row)

Paul.
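Working the selectivities out from those counts: 455788 / 15302138 is roughly 3.0% of measurement inside the time window, and 13475 / 77217 is roughly 17.4% of panconversation with interface = 11. Fetching ~3% of a 15-million-row table scattered across the heap is a case where the planner's preference for a sequential scan is at least defensible.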
On Wed, 5 Mar 2003 09:47:51 -0000, "Paul McKay" <paul_mckay@clearwater-it.co.uk> wrote:
>Hash Join  (cost=1532.83..345460.73 rows=75115 width=23)
>           (actual time=1769.84..66687.11 rows=16094 loops=1)
>  ->  Seq Scan on measurement  (cost=0.00..336706.07 rows=418859 width=15)
>                               (actual time=1280.11..59985.47 rows=455788 loops=1)
>  ->  Hash  (cost=1498.21..1498.21 rows=13848 width=8)
>            (actual time=253.49..253.49 rows=0 loops=1)
>        ->  Seq Scan on panconversation  (cost=0.00..1498.21 rows=13848 width=8)
>                                         (actual time=15.64..223.18 rows=13475 loops=1)
>Total runtime: 66694.82 msec

|clearview=# select count(*) from measurement;
| 15302138
|clearview=# select count(*) from panconversation;
| 77217

Paul, you seem to have a lot of dead tuples in your tables.

VACUUM FULL VERBOSE ANALYZE panconversation;
VACUUM FULL VERBOSE ANALYZE measurement;

This should cut your query time to ca. one third. If you could migrate to 7.3 and create your tables WITHOUT OIDS, I'd expect a further speed increase of ~15%.

Servus
Manfred
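A sketch of that 7.3 migration step for the measurement table, reusing the column definitions from the top of the thread; the NOT NULL and PRIMARY KEY clauses are assumptions, reinstating the key the thread established should exist:

    CREATE TABLE measurement (
        assessment integer NOT NULL,
        "time"     integer NOT NULL,
        value      character varying(50),
        PRIMARY KEY (assessment, "time")
    ) WITHOUT OIDS;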