help speeding up a query in postgres 8.4.5

From: "Maria L. Wilson"
Would really appreciate someone taking a look at the query below....  Thanks in advance!


this is on a linux box...
Linux dsrvr201.larc.nasa.gov 2.6.18-164.9.1.el5 #1 SMP Wed Dec 9 03:27:37 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

explain analyze
select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
from GRAN_VER GV
left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR INVS
where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and INVS.sensor_id='13'


"Aggregate  (cost=736364.52..736364.53 rows=1 width=8) (actual time=17532.930..17532.930 rows=1 loops=1)"
"  ->  Hash Join  (cost=690287.33..734679.77 rows=336949 width=8) (actual time=13791.593..17323.080 rows=924675 loops=1)"
"        Hash Cond: (invs.granule_id = gv.granule_id)"
"        ->  Seq Scan on invsensor invs  (cost=0.00..36189.41 rows=1288943 width=4) (actual time=0.297..735.375 rows=1277121 loops=1)"
"              Filter: (sensor_id = 13)"
"        ->  Hash  (cost=674401.52..674401.52 rows=1270865 width=16) (actual time=13787.698..13787.698 rows=1270750 loops=1)"
"              ->  Hash Join  (cost=513545.62..674401.52 rows=1270865 width=16) (actual time=1998.702..13105.578 rows=1270750 loops=1)"
"                    Hash Cond: (gv.granule_id = iv.granule_id)"
"                    ->  Seq Scan on gran_ver gv  (cost=0.00..75224.90 rows=4861490 width=4) (actual time=0.008..1034.885 rows=4867542 loops=1)"
"                    ->  Hash  (cost=497659.81..497659.81 rows=1270865 width=12) (actual time=1968.918..1968.918 rows=1270750 loops=1)"
"                          ->  Bitmap Heap Scan on inventory iv  (cost=24050.00..497659.81 rows=1270865 width=12) (actual time=253.542..1387.957 rows=1270750 loops=1)"
"                                Recheck Cond: (inv_id = 65)"
"                                ->  Bitmap Index Scan on inven_idx1  (cost=0.00..23732.28 rows=1270865 width=0) (actual time=214.364..214.364 rows=1270977 loops=1)"
"                                      Index Cond: (inv_id = 65)"
"Total runtime: 17533.100 ms"

some additional info.....
the table inventory is about 4481 MB and also has postgis types.
the table gran_ver is about 523 MB
the table INVSENSOR is about 217 MB

the server itself has 32G RAM with the following set in the postgres conf
shared_buffers = 3GB
work_mem = 64MB                      
maintenance_work_mem = 512MB       
wal_buffers = 6MB

let me know if I've forgotten anything!  thanks a bunch!!

Maria Wilson
NASA/Langley Research Center
Hampton, Virginia
m.l.wilson@nasa.gov




Re: help speeding up a query in postgres 8.4.5

From: Szymon Guz


Hi,
could you show us indexes that you have on all tables from this query? Have you tried running vacuum analyze on those tables? Do you have autovacuum active?
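For example, quick checks from psql (just a sketch, using the table names from your post):

show autovacuum;
vacuum analyze inventory;
vacuum analyze gran_ver;
vacuum analyze invsensor;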

regards
Szymon

Re: help speeding up a query in postgres 8.4.5

From: tv@fuzzy.cz
> some additional info.....
> the table inventory is about 4481 MB and also has postgis types.
> the table gran_ver is about 523 MB
> the table INVSENSOR is about 217 MB
>
> the server itself has 32G RAM with the following set in the postgres conf
> shared_buffers = 3GB
> work_mem = 64MB
> maintenance_work_mem = 512MB
> wal_buffers = 6MB

Not sure how to improve the query itself - it's rather simple and the
execution plan seems reasonable. You're dealing with a lot of data, so it
takes time to process.

Anyway, I'd try to bump up the shared buffers a bit (the tables you've
listed have about 5.5 GB, so 3GB of shared buffers won't cover it). OTOH
most of the data will be in pagecache maintained by the kernel anyway.

Try to increase the work_mem a bit, that might speed up the hash joins
(the two hash joins consumed about 15s, the whole query took 17s). This
does not require a restart, just do

set work_mem = '128MB'

(or 256MB) and then run the query in the same session. Let's see if that
works.
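
For instance, a sketch of that session-level test (same query as in the first post; 256MB is just an arbitrary bump to try):

set work_mem = '256MB';
explain analyze
select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
from GRAN_VER GV
left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR INVS
where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and INVS.sensor_id='13';

The SET only affects the current session, so nothing else on the server is touched.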

regards
Tomas


Re: help speeding up a query in postgres 8.4.5

From: "Maria L. Wilson"
Autovacuum is not running - but regular vacuums are being done twice daily.

indexes on inventory:

CREATE INDEX inven_idx1
  ON inventory
  USING btree
  (inv_id);

CREATE UNIQUE INDEX inven_idx2
  ON inventory
  USING btree
  (granule_id);

indexes on gran_ver:
CREATE UNIQUE INDEX granver_idx1
  ON gran_ver
  USING btree
  (granule_id);

indexes on invsensor
CREATE INDEX invsnsr_idx2
  ON invsensor
  USING btree
  (sensor_id);
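
A quick way to double-check the full list on all three tables is the standard pg_indexes view (just a sketch):

select tablename, indexname, indexdef
from pg_indexes
where tablename in ('inventory', 'gran_ver', 'invsensor');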




On 4/6/11 7:41 AM, Szymon Guz wrote:


Hi,
could you show us indexes that you have on all tables from this query? Have you tried running vacuum analyze on those tables? Do you have autovacuum active?

regards
Szymon

Re: help speeding up a query in postgres 8.4.5

From: "Maria L. Wilson"
thanks for the reply, Tomas.  I'll test bumping up work_mem and see how
that helps.....

thanks again,  Maria Wilson

On 4/6/11 9:16 AM, tv@fuzzy.cz wrote:
>> some additional info.....
>> the table inventory is about 4481 MB and also has postgis types.
>> the table gran_ver is about 523 MB
>> the table INVSENSOR is about 217 MB
>>
>> the server itself has 32G RAM with the following set in the postgres conf
>> shared_buffers = 3GB
>> work_mem = 64MB
>> maintenance_work_mem = 512MB
>> wal_buffers = 6MB
> Not sure how to improve the query itself - it's rather simple and the
> execution plan seems reasonable. You're dealing with a lot of data, so it
> takes time to process.
>
> Anyway, I'd try to bump up the shared buffers a bit (the tables you've
> listed have about 5.5 GB, so 3GB of shared buffers won't cover it). OTOH
> most of the data will be in pagecache maintained by the kernel anyway.
>
> Try to increase the work_mem a bit, that might speed up the hash joins
> (the two hash joins consumed about 15s, the whole query took 17s). This
> does not require a restart, just do
>
> set work_mem = '128MB'
>
> (or 256MB) and then run the query in the same session. Let's see if that
> works.
>
> regards
> Tomas
>

Re: help speeding up a query in postgres 8.4.5

From: "Kevin Grittner"
"Maria L. Wilson" <Maria.L.Wilson-1@nasa.gov> wrote:

> Autovacuum is not running - but regular vacuums are being done
> twice daily.

Is the ANALYZE keyword used on those VACUUM runs?  What version of
PostgreSQL is this?  If it's old enough to need fsm settings, do you run
with the VERBOSE option and check the end of the output to make sure
they are set high enough?
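
For example (just a sketch - the fsm warnings at the end of the output only apply to releases before 8.4):

VACUUM VERBOSE ANALYZE inventory;
-- pre-8.4 only: the last lines report whether max_fsm_pages / max_fsm_relations are large enough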

-Kevin

Re: help speeding up a query in postgres 8.4.5

From: Tomas Vondra
Dne 6.4.2011 17:33, Kevin Grittner napsal(a):
> "Maria L. Wilson" <Maria.L.Wilson-1@nasa.gov> wrote:
>
>> Autovacuum is not running - but regular vacuums are being done
>> twice daily.
>
> Is the ANALYZE keyword used on those VACUUM runs?  What version of
> PostgreSQL is this?  If it's old enough to need fsm settings, do you run
> with the VERBOSE option and check the end of the output to make sure
> they are set high enough?

Why do you think the problem is related to stale stats? The stats look
fairly accurate to me (see the explain analyze in the first post). All the
nodes are less than 1% off (which is great), except for the last hash
join, which returns 924675 rows instead of the 336949 expected.

Maybe I'm missing something, but the stats seem to be quite accurate and
there are probably very few dead tuples.

regards
Tomas

Re: help speeding up a query in postgres 8.4.5

From: "Maria L. Wilson"
yep - we use analyze and check the output.  It's version 8.4.5 so no fsm
issues.

thanks,  Maria

On 4/6/11 11:33 AM, Kevin Grittner wrote:
> "Maria L. Wilson"<Maria.L.Wilson-1@nasa.gov>  wrote:
>
>> Autovacuum is not running - but regular vacuums are being done
>> twice daily.
>
> Is the ANALYZE keyword used on those VACUUM runs?  What version of
> PostgreSQL is this?  If it's old enough to need fsm settings, do you run
> with the VERBOSE option and check the end of the output to make sure
> they are set high enough?
>
> -Kevin

Re: help speeding up a query in postgres 8.4.5

From: Robert Haas
On Tue, Apr 5, 2011 at 3:25 PM, Maria L. Wilson
<Maria.L.Wilson-1@nasa.gov> wrote:
> Would really appreciate someone taking a look at the query below....  Thanks
> in advance!
>
>
> this is on a linux box...
> Linux dsrvr201.larc.nasa.gov 2.6.18-164.9.1.el5 #1 SMP Wed Dec 9 03:27:37
> EST 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> explain analyze
> select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
> from GRAN_VER GV
> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
> INVS
> where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and
> INVS.sensor_id='13'
>
>
> "Aggregate  (cost=736364.52..736364.53 rows=1 width=8) (actual
> time=17532.930..17532.930 rows=1 loops=1)"
> "  ->  Hash Join  (cost=690287.33..734679.77 rows=336949 width=8) (actual
> time=13791.593..17323.080 rows=924675 loops=1)"
> "        Hash Cond: (invs.granule_id = gv.granule_id)"
> "        ->  Seq Scan on invsensor invs  (cost=0.00..36189.41 rows=1288943
> width=4) (actual time=0.297..735.375 rows=1277121 loops=1)"
> "              Filter: (sensor_id = 13)"
> "        ->  Hash  (cost=674401.52..674401.52 rows=1270865 width=16) (actual
> time=13787.698..13787.698 rows=1270750 loops=1)"
> "              ->  Hash Join  (cost=513545.62..674401.52 rows=1270865
> width=16) (actual time=1998.702..13105.578 rows=1270750 loops=1)"
> "                    Hash Cond: (gv.granule_id = iv.granule_id)"
> "                    ->  Seq Scan on gran_ver gv  (cost=0.00..75224.90
> rows=4861490 width=4) (actual time=0.008..1034.885 rows=4867542 loops=1)"
> "                    ->  Hash  (cost=497659.81..497659.81 rows=1270865
> width=12) (actual time=1968.918..1968.918 rows=1270750 loops=1)"
> "                          ->  Bitmap Heap Scan on inventory iv
> (cost=24050.00..497659.81 rows=1270865 width=12) (actual
> time=253.542..1387.957 rows=1270750 loops=1)"
> "                                Recheck Cond: (inv_id = 65)"
> "                                ->  Bitmap Index Scan on inven_idx1
> (cost=0.00..23732.28 rows=1270865 width=0) (actual time=214.364..214.364
> rows=1270977 loops=1)"
> "                                      Index Cond: (inv_id = 65)"
> "Total runtime: 17533.100 ms"
>
> some additional info.....
> the table inventory is about 4481 MB and also has postgis types.
> the table gran_ver is about 523 MB
> the table INVSENSOR is about 217 MB
>
> the server itself has 32G RAM with the following set in the postgres conf
> shared_buffers = 3GB
> work_mem = 64MB
> maintenance_work_mem = 512MB
> wal_buffers = 6MB
>
> let me know if I've forgotten anything!  thanks a bunch!!

Late response here, but...

Is there an index on invsensor (sensor_id, granule_id)?  If not, that
might be something to try.  If so, you might want to try to figure out
why it's not being used.

Likewise, is there an index on gran_ver (granule_id)?
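
For example, something along these lines (untested sketch; the index name is just illustrative):

CREATE INDEX invsnsr_idx3
  ON invsensor
  USING btree
  (sensor_id, granule_id);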

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: help speeding up a query in postgres 8.4.5

From: Robert Haas
[ woops, accidentally replied off-list, trying again ]

On Tue, May 10, 2011 at 1:47 PM, Maria L. Wilson
<Maria.L.Wilson-1@nasa.gov> wrote:
> thanks for taking a look at this....  and it's never too late!!
>
> I've tried bumping up work_mem and did not see any improvements -
> All the indexes you asked about do exist.... see below....
> Any other ideas?
>
> CREATE INDEX invsnsr_idx1
>  ON invsensor
>  USING btree
>  (granule_id);
>
> CREATE INDEX invsnsr_idx2
>  ON invsensor
>  USING btree
>  (sensor_id);

What about a composite index on both columns?

> CREATE UNIQUE INDEX granver_idx1
>  ON gran_ver
>  USING btree
>  (granule_id);

It's a bit surprising to me that this isn't getting used.  How big are
these tables, and how much memory do you have, and what values are you
using for seq_page_cost/random_page_cost/effective_cache_size?

...Robert

Re: help speeding up a query in postgres 8.4.5

From: "Maria L. Wilson"
thanks for taking a look at this....  and it's never too late!!

I've tried bumping up work_mem and did not see any improvements -
All the indexes you asked about do exist.... see below....
Any other ideas?

CREATE INDEX invsnsr_idx1
   ON invsensor
   USING btree
   (granule_id);

CREATE INDEX invsnsr_idx2
   ON invsensor
   USING btree
   (sensor_id);

CREATE UNIQUE INDEX granver_idx1
   ON gran_ver
   USING btree
   (granule_id);

thanks for the look -
Maria Wilson
NASA/Langley Research Center
Hampton, Virginia 23681
m.l.wilson@nasa.gov

On 5/10/11 1:38 PM, Robert Haas wrote:
> On Tue, Apr 5, 2011 at 3:25 PM, Maria L. Wilson
> <Maria.L.Wilson-1@nasa.gov>  wrote:
>> Would really appreciate someone taking a look at the query below....  Thanks
>> in advance!
>>
>>
>> this is on a linux box...
>> Linux dsrvr201.larc.nasa.gov 2.6.18-164.9.1.el5 #1 SMP Wed Dec 9 03:27:37
>> EST 2009 x86_64 x86_64 x86_64 GNU/Linux
>>
>> explain analyze
>> select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
>> from GRAN_VER GV
>> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
>> INVS
>> where IV.INV_ID='65' and GV.GRANULE_ID = INVS.granule_id and
>> INVS.sensor_id='13'
>>
>>
>> "Aggregate  (cost=736364.52..736364.53 rows=1 width=8) (actual
>> time=17532.930..17532.930 rows=1 loops=1)"
>> "  ->   Hash Join  (cost=690287.33..734679.77 rows=336949 width=8) (actual
>> time=13791.593..17323.080 rows=924675 loops=1)"
>> "        Hash Cond: (invs.granule_id = gv.granule_id)"
>> "        ->   Seq Scan on invsensor invs  (cost=0.00..36189.41 rows=1288943
>> width=4) (actual time=0.297..735.375 rows=1277121 loops=1)"
>> "              Filter: (sensor_id = 13)"
>> "        ->   Hash  (cost=674401.52..674401.52 rows=1270865 width=16) (actual
>> time=13787.698..13787.698 rows=1270750 loops=1)"
>> "              ->   Hash Join  (cost=513545.62..674401.52 rows=1270865
>> width=16) (actual time=1998.702..13105.578 rows=1270750 loops=1)"
>> "                    Hash Cond: (gv.granule_id = iv.granule_id)"
>> "                    ->   Seq Scan on gran_ver gv  (cost=0.00..75224.90
>> rows=4861490 width=4) (actual time=0.008..1034.885 rows=4867542 loops=1)"
>> "                    ->   Hash  (cost=497659.81..497659.81 rows=1270865
>> width=12) (actual time=1968.918..1968.918 rows=1270750 loops=1)"
>> "                          ->   Bitmap Heap Scan on inventory iv
>> (cost=24050.00..497659.81 rows=1270865 width=12) (actual
>> time=253.542..1387.957 rows=1270750 loops=1)"
>> "                                Recheck Cond: (inv_id = 65)"
>> "                                ->   Bitmap Index Scan on inven_idx1
>> (cost=0.00..23732.28 rows=1270865 width=0) (actual time=214.364..214.364
>> rows=1270977 loops=1)"
>> "                                      Index Cond: (inv_id = 65)"
>> "Total runtime: 17533.100 ms"
>>
>> some additional info.....
>> the table inventory is about 4481 MB and also has postgis types.
>> the table gran_ver is about 523 MB
>> the table INVSENSOR is about 217 MB
>>
>> the server itself has 32G RAM with the following set in the postgres conf
>> shared_buffers = 3GB
>> work_mem = 64MB
>> maintenance_work_mem = 512MB
>> wal_buffers = 6MB
>>
>> let me know if I've forgotten anything!  thanks a bunch!!
> Late response here, but...
>
> Is there an index on invsensor (sensor_id, granule_id)?  If not, that
> might be something to try.  If so, you might want to try to figure out
> why it's not being used.
>
> Likewise, is there an index on gran_ver (granule_id)?
>

Re: help speeding up a query in postgres 8.4.5

From: "Maria L. Wilson"
haven't tested a composite index

invsensor is 2,003,980 rows and 219MB
granver is 5,138,730 rows and 556MB
the machine has 32G memory
seq_page_cost, random_page_cost & effective_cache_size are set to the
defaults (1, 4, and 128MB) - looks like they could be bumped up.
Got any recommendations?

Maria

On 5/10/11 1:59 PM, Robert Haas wrote:
> [ woops, accidentally replied off-list, trying again ]
>
> On Tue, May 10, 2011 at 1:47 PM, Maria L. Wilson
> <Maria.L.Wilson-1@nasa.gov>  wrote:
>> thanks for taking a look at this....  and it's never too late!!
>>
>> I've tried bumping up work_mem and did not see any improvements -
>> All the indexes you asked about do exist.... see below....
>> Any other ideas?
>>
>> CREATE INDEX invsnsr_idx1
>>   ON invsensor
>>   USING btree
>>   (granule_id);
>>
>> CREATE INDEX invsnsr_idx2
>>   ON invsensor
>>   USING btree
>>   (sensor_id);
> What about a composite index on both columns?
>
>> CREATE UNIQUE INDEX granver_idx1
>>   ON gran_ver
>>   USING btree
>>   (granule_id);
> It's a bit surprising to me that this isn't getting used.  How big are
> these tables, and how much memory do you have, and what values are you
> using for seq_page_cost/random_page_cost/effective_cache_size?
>
> ...Robert

Re: help speeding up a query in postgres 8.4.5

From: Scott Marlowe
On Tue, Apr 5, 2011 at 1:25 PM, Maria L. Wilson
<Maria.L.Wilson-1@nasa.gov> wrote:

This bit:

> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
> INVS

has both an explicit and an implicit join.  This can constrain join
re-ordering in the planner.  Can you change it to explicit joins only
and see if that helps?
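
For example, an untested sketch using explicit inner joins only (the IV.INV_ID filter already discards any unmatched inventory rows, so the outer join isn't buying anything):

select MIN(IV.STRTDATE), MAX(IV.ENDDATE)
from GRAN_VER GV
join INVSENSOR INVS on GV.GRANULE_ID = INVS.granule_id
join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID
where IV.INV_ID='65' and INVS.sensor_id='13';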

Re: help speeding up a query in postgres 8.4.5

From: Robert Haas
On Tue, May 10, 2011 at 2:20 PM, Maria L. Wilson
<Maria.L.Wilson-1@nasa.gov> wrote:
> haven't tested a composite index
>
> invsensor is 2,003,980 rows and 219MB
> granver is 5,138,730 rows and 556MB
> the machine has 32G memory
> seq_page_cost, random_page_cost & effective_cache_size are set to the
> defaults (1, 4, and 128MB) - looks like they could be bumped up.
> Got any recommendations?

Yeah, I'd try setting effective_cache_size=24GB, seq_page_cost=0.1,
random_page_cost=0.1 and see if you get a better plan.  If possible,
can you post the EXPLAIN ANALYZE output with those settings for us?

If that doesn't cause the planner to use the indexes, then I'd be
suspicious that there is something wrong with those indexes that makes
the planner think it *can't* use them.  It would be helpful to see the
EXPLAIN output after SET enable_seqscan=off.
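
A sketch of that session-level test (the values are just the ones suggested above; no restart or postgresql.conf edit is needed):

set effective_cache_size = '24GB';
set seq_page_cost = 0.1;
set random_page_cost = 0.1;
-- re-run the EXPLAIN ANALYZE from the first post, and then separately:
set enable_seqscan = off;
-- and run it once more to see what the planner picks when seq scans are penalized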

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: help speeding up a query in postgres 8.4.5

From: Tom Lane
Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Tue, Apr 5, 2011 at 1:25 PM, Maria L. Wilson
> <Maria.L.Wilson-1@nasa.gov> wrote:
> This bit:

>> left outer join INVENTORY IV on GV.GRANULE_ID = IV.GRANULE_ID, INVSENSOR
>> INVS

> has both an explicit and an implicit join.  This can constrain join
> re-ordering in the planner.  Can you change it to explicit joins only
> and see if that helps?

Since there's a WHERE constraint on IV, the outer join is going to be
strength-reduced to an inner join (note the lack of any outer joins in
the plan).  So that isn't going to matter.

AFAICS this is just plain an expensive query.  The two filter
constraints are not very selective, each passing more than a million
rows up to the join.  You can't expect to join millions of rows in no
time flat.  About all you can do is try to bump up work_mem enough that
the join won't use temp files --- for something like this, that's likely
to require a setting of hundreds of MB.  I'm not sure whether Maria is
using a version in which EXPLAIN ANALYZE will show whether a hash join
was batched, but that's what I'd be looking at.
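
For example (a sketch - 512MB is only a guess at "hundreds of MB", and whether the Hash nodes print batch information depends on the server version, as noted above):

set work_mem = '512MB';
-- re-run the EXPLAIN ANALYZE; if the Hash nodes report something like
-- "Batches: 1", the hash fit in memory and no temp files were used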

            regards, tom lane