Thread: help speeding up a query in postgres 8.4.5
Would really appreciate someone taking a look at the query below.... Thanks in advance!
this is on a linux box...
Linux dsrvr201.larc.nasa.gov 2.6.18-164.9.1.el5 #1 SMP Wed Dec 9 03:27:37 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
explain analyze
select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
from GRAN_VER GV
left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR INVS
where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and INVS.sensor_id='13'
"Aggregate (cost=736364.52..736364.53 rows=1 width=8) (actual time=17532.930..17532.930 rows=1 loops=1)"
" -> Hash Join (cost=690287.33..734679.77 rows=336949 width=8) (actual time=13791.593..17323.080 rows=924675 loops=1)"
" Hash Cond: (invs.granule_id = gv.granule_id)"
" -> Seq Scan on invsensor invs (cost=0.00..36189.41 rows=1288943 width=4) (actual time=0.297..735.375 rows=1277121 loops=1)"
" Filter: (sensor_id = 13)"
" -> Hash (cost=674401.52..674401.52 rows=1270865 width=16) (actual time=13787.698..13787.698 rows=1270750 loops=1)"
" -> Hash Join (cost=513545.62..674401.52 rows=1270865 width=16) (actual time=1998.702..13105.578 rows=1270750 loops=1)"
" Hash Cond: (gv.granule_id = iv.granule_id)"
" -> Seq Scan on gran_ver gv (cost=0.00..75224.90 rows=4861490 width=4) (actual time=0.008..1034.885 rows=4867542 loops=1)"
" -> Hash (cost=497659.81..497659.81 rows=1270865 width=12) (actual time=1968.918..1968.918 rows=1270750 loops=1)"
" -> Bitmap Heap Scan on inventory iv (cost=24050.00..497659.81 rows=1270865 width=12) (actual time=253.542..1387.957 rows=1270750 loops=1)"
" Recheck Cond: (inv_id = 65)"
" -> Bitmap Index Scan on inven_idx1 (cost=0.00..23732.28 rows=1270865 width=0) (actual time=214.364..214.364 rows=1270977 loops=1)"
" Index Cond: (inv_id = 65)"
"Total runtime: 17533.100 ms"
some additional info.....
the table inventory is about 4481 MB and also has postgis types.
the table gran_ver is about 523 MB
the table INVSENSOR is about 217 MB
the server itself has 32G RAM with the following set in the postgres conf
shared_buffers = 3GB
work_mem = 64MB
maintenance_work_mem = 512MB
wal_buffers = 6MB
let me know if I've forgotten anything! thanks a bunch!!
Maria Wilson
NASA/Langley Research Center
Hampton, Virginia
m.l.wilson@nasa.gov
Hi,
could you show us the indexes you have on all the tables in this query? Have you tried running VACUUM ANALYZE on those tables? Do you have autovacuum active?
regards
Szymon
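(One way to pull that information together, as a sketch - this assumes the tables live under their lower-case names as shown in the plan:)

-- index definitions for the three tables in the query
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('inventory', 'gran_ver', 'invsensor');

-- refresh planner statistics (and reclaim dead space) on each table
VACUUM ANALYZE inventory;
VACUUM ANALYZE gran_ver;
VACUUM ANALYZE invsensor;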
> the server itself has 32G RAM with the following set in the postgres conf
> shared_buffers = 3GB
> work_mem = 64MB

Not sure how to improve the query itself - it's rather simple and the execution plan seems reasonable. You're dealing with a lot of data, so it takes time to process.

Anyway, I'd try to bump up the shared buffers a bit (the tables you've listed add up to about 5.5 GB, so 3GB of shared buffers won't cover them). OTOH most of the data will be in the page cache maintained by the kernel anyway.

Try to increase work_mem a bit; that might speed up the hash joins (the two hash joins consumed about 15s of the 17s the whole query took). This does not require a restart, just do

set work_mem = '128MB'

(or 256MB) and then run the query in the same session. Let's see if that works.

regards
Tomas
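(A concrete sketch of that session-level experiment, reusing the query from the first post; 256MB is just an example value:)

set work_mem = '256MB';   -- affects only the current session

explain analyze
select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
from GRAN_VER GV
left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR INVS
where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and INVS.sensor_id='13';

reset work_mem;           -- back to the postgresql.conf value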
Autovacuum is not running - but regular vacuums are being done twice daily.
indexes on inventory:
CREATE INDEX inven_idx1
ON inventory
USING btree
(inv_id);
CREATE UNIQUE INDEX inven_idx2
ON inventory
USING btree
(granule_id);
indexes on gran_ver:
CREATE UNIQUE INDEX granver_idx1
ON gran_ver
USING btree
(granule_id);
indexes on invsensor:
CREATE INDEX invsnsr_idx2
ON invsensor
USING btree
(sensor_id);
thanks for the reply, Tomas. I'll test bumping up work_mem and see how that helps.....

thanks again, Maria Wilson
"Maria L. Wilson" <Maria.L.Wilson-1@nasa.gov> wrote: > Autovacuum is not running - but regular vacuums are being done > twice daily. Is the ANALYZE keyword used on those VACUUM runs? What version of PostgreSQL is this. If it's enough to need fsm settings, do you run with the VERBOSE option and check the end of the output to make sure they are set high enough? -Kevin
On 6.4.2011 17:33, Kevin Grittner wrote:
> Is the ANALYZE keyword used on those VACUUM runs? What version of
> PostgreSQL is this? If it's old enough to need the fsm settings, do you
> run with the VERBOSE option and check the end of the output to make
> sure they are set high enough?

Why do you think the problem is related to stale stats? The stats look fairly accurate to me - see the explain analyze in the first post. All the nodes are less than 1% off (which is great), except for the top hash join, which was estimated at 336949 rows but actually returned 924675.

Maybe I'm missing something, but the stats seem to be quite accurate and there are probably very few dead tuples.

regards
Tomas
yep - we use analyze and check the output. It's version 8.4.5 so no fsm issues.

thanks, Maria
On Tue, Apr 5, 2011 at 3:25 PM, Maria L. Wilson <Maria.L.Wilson-1@nasa.gov> wrote:
> explain analyze
> select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
> from GRAN_VER GV
> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
> INVS
> where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and
> INVS.sensor_id='13'

Late response here, but...

Is there an index on invsensor (sensor_id, granule_id)? If not, that might be something to try. If so, you might want to try to figure out why it's not being used.

Likewise, is there an index on gran_ver (granule_id)?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
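(If that composite index doesn't already exist, a sketch of what Robert is suggesting - the name invsnsr_idx3 is just an example:)

-- composite index covering the sensor_id filter and the granule_id join key
CREATE INDEX invsnsr_idx3
  ON invsensor
  USING btree
  (sensor_id, granule_id);

-- refresh statistics so the planner sees the new index right away
ANALYZE invsensor;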
[ woops, accidentally replied off-list, trying again ]

On Tue, May 10, 2011 at 1:47 PM, Maria L. Wilson <Maria.L.Wilson-1@nasa.gov> wrote:
> thanks for taking a look at this.... and it's never too late!!
>
> I've tried bumping up work_mem and did not see any improvements -
> All the indexes do exist that you asked.... see below....
> Any other ideas?
>
> CREATE INDEX invsnsr_idx1
>   ON invsensor
>   USING btree
>   (granule_id);
>
> CREATE INDEX invsnsr_idx2
>   ON invsensor
>   USING btree
>   (sensor_id);

What about a composite index on both columns?

> CREATE UNIQUE INDEX granver_idx1
>   ON gran_ver
>   USING btree
>   (granule_id);

It's a bit surprising to me that this isn't getting used. How big are these tables, how much memory do you have, and what values are you using for seq_page_cost/random_page_cost/effective_cache_size?

...Robert
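(One way to gather those numbers, as a sketch:)

-- table sizes, including indexes and TOAST
SELECT pg_size_pretty(pg_total_relation_size('invsensor'));
SELECT pg_size_pretty(pg_total_relation_size('gran_ver'));

-- current planner settings in this session
SHOW seq_page_cost;
SHOW random_page_cost;
SHOW effective_cache_size;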
thanks for taking a look at this.... and it's never too late!!

I've tried bumping up work_mem and did not see any improvements -
All the indexes do exist that you asked.... see below....
Any other ideas?

CREATE INDEX invsnsr_idx1
  ON invsensor
  USING btree
  (granule_id);

CREATE INDEX invsnsr_idx2
  ON invsensor
  USING btree
  (sensor_id);

CREATE UNIQUE INDEX granver_idx1
  ON gran_ver
  USING btree
  (granule_id);

thanks for the look -

Maria Wilson
NASA/Langley Research Center
Hampton, Virginia 23681
m.l.wilson@nasa.gov
haven't tested a composite index

invsensor is 2,003,980 rows and 219MB
granver is 5,138,730 rows and 556MB
the machine has 32G memory
seq_page_cost, random_page_cost & effective_cache_size are set to the defaults (1, 4, and 128MB) - looks like they could be bumped up. Got any recommendations?

Maria
On Tue, Apr 5, 2011 at 1:25 PM, Maria L. Wilson <Maria.L.Wilson-1@nasa.gov> wrote:

This bit:

> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
> INVS

has both an explicit and an implicit join. This can constrain join re-ordering in the planner. Can you change it to explicit joins only and see if that helps?
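(A sketch of the query with both joins written explicitly - the result should be the same, since the WHERE clause already ties INVSENSOR to GRAN_VER:)

explain analyze
select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
from GRAN_VER GV
left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID
join INVSENSOR INVS on GV.GRANULE_ID = INVS.granule_id
where IV.INV_ID = '65'
  and INVS.sensor_id = '13';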
On Tue, May 10, 2011 at 2:20 PM, Maria L. Wilson <Maria.L.Wilson-1@nasa.gov> wrote:
> haven't tested a composite index
>
> invsensor is 2,003,980 rows and 219MB
> granver is 5,138,730 rows and 556MB
> the machine has 32G memory
> seq_page_cost, random_page_cost & effective_cache_size are set to the
> defaults (1, 4, and 128MB) - looks like they could be bumped up.
> Got any recommendations?

Yeah, I'd try setting effective_cache_size=24GB, seq_page_cost=0.1, random_page_cost=0.1 and see if you get a better plan. If possible, can you post the EXPLAIN ANALYZE output with those settings for us?

If that doesn't cause the planner to use the indexes, then I'd be suspicious that there is something wrong with those indexes that makes the planner think it *can't* use them. It would be helpful to see the EXPLAIN output after SET enable_seqscan=off.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
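(As a session-level sketch of those two experiments:)

-- try the suggested planner settings for this session only
set effective_cache_size = '24GB';
set seq_page_cost = 0.1;
set random_page_cost = 0.1;
-- then rerun the EXPLAIN ANALYZE from the first post and compare plans

-- diagnostic step if the planner still won't touch the indexes:
set enable_seqscan = off;
-- rerun plain EXPLAIN and see whether an index plan appears and what it costs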
Scott Marlowe <scott.marlowe@gmail.com> writes:
> This bit:
>> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
>> INVS
> has both an explicit and an implicit join. This can constrain join
> re-ordering in the planner. Can you change it to explicit joins only
> and see if that helps?

Since there's a WHERE constraint on IV, the outer join is going to be strength-reduced to an inner join (note the lack of any outer joins in the plan). So that isn't going to matter.

AFAICS this is just plain an expensive query. The two filter constraints are not very selective, each passing more than a million rows up to the join. You can't expect to join millions of rows in no time flat. About all you can do is try to bump up work_mem enough that the join won't use temp files --- for something like this, that's likely to require a setting of hundreds of MB. I'm not sure whether Maria is using a version in which EXPLAIN ANALYZE will show whether a hash join was batched, but that's what I'd be looking at.

regards, tom lane
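(One way to check for spilling, as a sketch; log_temp_files needs superuser rights or a postgresql.conf change:)

-- generous per-session work_mem for this one query
set work_mem = '512MB';

-- log every temporary file this backend creates (0 = no size threshold),
-- so a hash join that spills to disk shows up in the server log
set log_temp_files = 0;

-- then rerun the EXPLAIN ANALYZE from the first post; newer PostgreSQL
-- releases also report "Buckets: ... Batches: ..." on the Hash nodes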