Thread: index / sequential scan problem
Hi all,

I'm currently taking my first steps with DB optimization and am wondering
what's happening here and if/how I can help PG choose the better plan.

Thanks,
  Fabian

psql (PostgreSQL) 7.2.2

perg_1097=# VACUUM ANALYZE ;
VACUUM
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=0.00..56125.80 rows=15561 width=12) (actual time=0.28..2305.52 rows=31122 loops=1)
  SubPlan
    ->  Seq Scan on notiz_gelesen b  (cost=0.00..1.79 rows=1 width=0) (actual time=0.07..0.07 rows=0 loops=31122)
Total runtime: 2334.42 msec

EXPLAIN
perg_1097=# SET enable_seqscan to false;
SET VARIABLE
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=100000000.00..100111719.36 rows=15561 width=12) (actual time=0.24..538.86 rows=31122 loops=1)
  SubPlan
    ->  Index Scan using idx_notiz_gelesen_2 on notiz_gelesen b  (cost=0.00..3.57 rows=1 width=0) (actual time=0.01..0.01 rows=0 loops=31122)
Total runtime: 570.75 msec

EXPLAIN
perg_1097=# \d notiz_objekt
      Table "notiz_objekt"
  Column  |  Type   | Modifiers
----------+---------+-----------
 notiz_id | integer |
 obj_id   | integer |
 obj_typ  | integer |
Indexes: idx_notiz_objekt_1,
         idx_notiz_objekt_2

perg_1097=# \d notiz_gelesen
            Table "notiz_gelesen"
  Column  |           Type           |                     Modifiers
----------+--------------------------+----------------------------------------------------
 notiz_id | integer                  |
 ma_id    | integer                  |
 ma_pid   | integer                  |
 stamp    | timestamp with time zone | default ('now'::text)::timestamp(6) with time zone
 anzeigen | character varying        |
Indexes: idx_notiz_gelesen_1,
         idx_notiz_gelesen_2

perg_1097=# select count(*) from notiz_objekt;
 count
-------
 31122
(1 row)

perg_1097=# select count(*) from notiz_gelesen;
 count
-------
    45
(1 row)

perg_1097=#

idx_notiz_gelesen_1 (ma_id,ma_pid)
idx_notiz_gelesen_2 (notiz_id)
On 17 Jul 2003 at 11:01, Fabian Kreitner wrote:

> psql (PostgreSQL) 7.2.2
>
> perg_1097=# VACUUM ANALYZE ;
> VACUUM
> perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
> perg_1097-# from notiz_objekt a
> perg_1097-# where not exists
> perg_1097-# (
> perg_1097(# select 1
> perg_1097(# from notiz_gelesen b
> perg_1097(# where ma_id = 2001
> perg_1097(# and ma_pid = 1097
> perg_1097(# and a.notiz_id = b.notiz_id
> perg_1097(# )
> perg_1097-# ;

For 31K records, a seq. scan does not sound like a bad plan to me, but anyway...

How about

  where ma_id = 2001::integer
  and ma_pid = 1097::integer

in the above query?

Bye
 Shridhar

--
No one can guarantee the actions of another.
		-- Spock, "Day of the Dove", stardate unknown
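For clarity, a sketch of that suggestion applied to the full query from the first post. The explicit ::integer casts are meant to rule out a literal/column type mismatch keeping the planner off the index; whether that is actually the issue here is only a guess, since plain integer literals should already match the integer columns.

  EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
  from notiz_objekt a
  where not exists
  (
    select 1
    from notiz_gelesen b
    where ma_id = 2001::integer
    and ma_pid = 1097::integer
    and a.notiz_id = b.notiz_id
  );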
On 17/07/2003 10:01 Fabian Kreitner wrote:

Hi Fabian,

When you are doing these kinds of tests, you need to be aware that the
kernel may have most of your data cached after the first query, and this
may be why the second query appears to run faster.

Also don't be worried if the planner chooses a seq scan for small tables,
as the whole table can often be brought into memory with one I/O, whereas
reading the index and then the table would be 2 I/Os.

HTH

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
At 12:12 17.07.2003, you wrote:
>On 17/07/2003 10:01 Fabian Kreitner wrote:
>
>Hi Fabian,
>
>When you are doing these kinds of tests, you need to be aware that the
>kernel may have most of your data cached after the first query, and this
>may be why the second query appears to run faster.

I thought of this too, but the execution times don't change when repeating
or alternating these two tests.

>Also don't be worried if the planner chooses a seq scan for small tables,
>as the whole table can often be brought into memory with one I/O, whereas
>reading the index and then the table would be 2 I/Os.
>
>HTH

That is what I read too, and is why I'm confused that the index is indeed
executing faster. Can this be a problem with the hardware and/or the
Postgres installation?

Thanks,
  Fabian
At 11:17 17.07.2003, Shridhar Daithankar wrote:
>On 17 Jul 2003 at 11:01, Fabian Kreitner wrote:
> > psql (PostgreSQL) 7.2.2
> >
> > perg_1097=# VACUUM ANALYZE ;
> > VACUUM
> > perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
> > perg_1097-# from notiz_objekt a
> > perg_1097-# where not exists
> > perg_1097-# (
> > perg_1097(# select 1
> > perg_1097(# from notiz_gelesen b
> > perg_1097(# where ma_id = 2001
> > perg_1097(# and ma_pid = 1097
> > perg_1097(# and a.notiz_id = b.notiz_id
> > perg_1097(# )
> > perg_1097-# ;
>
>For 31K records, a seq. scan does not sound like a bad plan to me, but anyway...

I'm not generally worried that it uses a seq scan, but that the second
example (where an index on the sub-select is used on a table with only 45
entries) executes more than 4 times faster. It's not a cache thing either,
since I can enable seqscan again and it will run at 2300ms again.

>How about
>
>  where ma_id = 2001::integer
>  and ma_pid = 1097::integer
>
>in the above query?

I don't really understand in what way this will help the planner, but I'll try.

Thanks,
  Fabian
On 17 Jul 2003 at 13:12, Fabian Kreitner wrote:
> At 11:17 17.07.2003, Shridhar Daithankar wrote:
> >How about
> >
> >  where ma_id = 2001::integer
> >  and ma_pid = 1097::integer
> >
> >in the above query?
>
> I don't really understand in what way this will help the planner, but I'll try.

That is typecasting. It helps the planner interpret the query more accurately.

Bye
 Shridhar

--
QOTD:	"I may not be able to walk, but I drive from the sitting position."
On 17/07/2003 12:13 Fabian Kreitner wrote:
> That is what I read too, and is why I'm confused that the index is indeed
> executing faster. Can this be a problem with the hardware and/or the
> Postgres installation?

It's more likely that the OS has most of the data cached after the first
query and so doesn't need to re-read that data from disk when you retry
the query with seq scan disabled. Try something like this:

set enable_seqscan to true;
explain analyze ......
set enable_seqscan to false;
explain analyze ......
set enable_seqscan to true;
explain analyze ......

I expect you will find that the third query is also a lot faster than the
first query.

HTH

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
At 14:34 17.07.2003, you wrote:
>On 17/07/2003 12:13 Fabian Kreitner wrote:
>>That is what I read too, and is why I'm confused that the index is indeed
>>executing faster. Can this be a problem with the hardware and/or the
>>Postgres installation?
>
>It's more likely that the OS has most of the data cached after the first
>query and so doesn't need to re-read that data from disk when you retry
>the query with seq scan disabled. Try something like this:
>
>set enable_seqscan to true;
>explain analyze ......
>set enable_seqscan to false;
>explain analyze ......
>set enable_seqscan to true;
>explain analyze ......
>
>I expect you will find that the third query is also a lot faster than the
>first query.

I'm afraid not.
The database has been stopped / started right before this.

perg_1097=# set enable_seqscan to true;
SET VARIABLE
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=0.00..56125.80 rows=15561 width=12) (actual time=0.28..2298.71 rows=31122 loops=1)
  SubPlan
    ->  Seq Scan on notiz_gelesen b  (cost=0.00..1.79 rows=1 width=0) (actual time=0.07..0.07 rows=0 loops=31122)
Total runtime: 2327.37 msec

EXPLAIN
perg_1097=# set enable_seqscan to false;
SET VARIABLE
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=100000000.00..100111719.36 rows=15561 width=12) (actual time=0.25..535.75 rows=31122 loops=1)
  SubPlan
    ->  Index Scan using idx_notiz_gelesen_2 on notiz_gelesen b  (cost=0.00..3.57 rows=1 width=0) (actual time=0.01..0.01 rows=0 loops=31122)
Total runtime: 567.94 msec

EXPLAIN
perg_1097=# set enable_seqscan to true;
SET VARIABLE
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=0.00..56125.80 rows=15561 width=12) (actual time=0.13..2300.74 rows=31122 loops=1)
  SubPlan
    ->  Seq Scan on notiz_gelesen b  (cost=0.00..1.79 rows=1 width=0) (actual time=0.07..0.07 rows=0 loops=31122)
Total runtime: 2330.25 msec

EXPLAIN
perg_1097=#
On 17 Jul 2003 at 14:50, Fabian Kreitner wrote:
> At 14:34 17.07.2003, you wrote:
> >I expect you will find that the third query is also a lot faster than the
> >first query.
>
> I'm afraid not.
> The database has been stopped / started right before this.
>
> perg_1097=# set enable_seqscan to true;
> SET VARIABLE
> perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
> perg_1097-# from notiz_objekt a
> perg_1097-# where not exists

Well, he said query and not the query plan... :-)

While explain analyze is great for judging what is happening, it's not
always a good idea to trust the numbers it produces. It will probably
produce the same numbers for a SCSI-disk machine and an IDE-disk machine,
everything else being equal. Obviously that is not correct.

The only thing you can trust in explain analyze is its plan. The numbers
are based on heuristics and should be taken as hints only.

Bye
 Shridhar

--
Harrisberger's Fourth Law of the Lab:	Experience is directly proportional
to the amount of equipment ruined.
I've seen similar behavior in my own queries. I found that reducing
random_page_cost from the default of 4 down to 2 caused the query to
choose the index, and resulted in an order-of-magnitude improvement on
some queries.

On Thu, 2003-07-17 at 05:50, Fabian Kreitner wrote:
> At 14:34 17.07.2003, you wrote:
> >I expect you will find that the third query is also a lot faster than the
> >first query.
>
> I'm afraid not.
> The database has been stopped / started right before this.
>
> [snip]

--
Jord Tanner <jord@indygecko.com>
On 17/07/2003 13:50 Fabian Kreitner wrote:
> [snip]
> I'm afraid not.
> The database has been stopped / started right before this.
> [snip]

1) enable_seqscan = true

> Seq Scan on notiz_objekt a  (cost=0.00..56125.80 rows=15561 width=12)
> (actual time=0.28..2298.71 rows=31122 loops=1)
> [snip]

2) enable_seqscan = false

> Seq Scan on notiz_objekt a  (cost=100000000.00..100111719.36 rows=15561
> width=12) (actual time=0.25..535.75 rows=31122 loops=1)

I've just noticed this. Something is not right here. Look at the crazy
cost estimate for the second query. It looks to me like enable_indexscan,
enable_tidscan, enable_sort, enable_nestloop, enable_mergejoin or
enable_hashjoin has been set to false. Looking at the source, that's the
only way I can see such large numbers being produced.

HTH

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
Paul Thomas <paul@tmsl.demon.co.uk> writes:
> 2) enable_seqscan = false
>> Seq Scan on notiz_objekt a  (cost=100000000.00..100111719.36 rows=15561
>> width=12) (actual time=0.25..535.75 rows=31122 loops=1)

> I've just noticed this. Something is not right here. Look at the crazy
> cost estimate for the second query.

No, that's exactly what it's supposed to do.  enable_seqscan cannot
simply suppress generation of a seqscan plan (because that might be the
only way to do the query, if there's no applicable index).  So it
generates the plan, but sticks a large penalty into the cost estimate
to keep the planner from choosing that alternative if there is any
other.  The "100000000.00" is that artificial penalty.

We could probably hide this implementation detail from you if we tried
hard enough, but it hasn't bothered anyone enough to try.

			regards, tom lane
Fabian Kreitner <fabian.kreitner@ainea-ag.de> writes:
> That is what I read too, and is why I'm confused that the index is indeed
> executing faster. Can this be a problem with the hardware and/or the
> Postgres installation?

I think the actual issue here is that you are executing the EXISTS
subplan over and over, once for each outer row.  The planner's cost
estimate for EXISTS is based on the assumption that you do it once ...
in which scenario the seqscan very possibly is cheaper.  However, when
you do the EXISTS subplan over and over for many outer rows, you get a
savings from the fact that the index and table pages soon get cached in
memory.  The seqscan plan gets a savings too, since the table is small
enough to fit in memory, but once everything is in memory the indexscan
plan is faster.

There's been some discussion on pghackers about how to teach the planner
to account for repeated executions of subplans, but we have not come up
with a good solution yet.

For the moment, what people tend to do if they know their database is
small enough to mostly stay in memory is to reduce random_page_cost to
make the planner favor indexscans.  If you know the database is entirely
cached then the theoretically correct value of random_page_cost is 1.0
(since fetching any page will cost the same, if it's all in RAM).  I'd
recommend against adopting that as a default, but a lot of people find
that setting it to 2.0 or so seems to model their situation better than
the out-of-the-box 4.0.

			regards, tom lane
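For reference, a minimal sketch of trying this for the current session only. The 2.0 value follows the suggestion above and is illustrative, not a recommendation; the query is the one from the start of the thread.

  set random_page_cost to 2.0;   -- per-session setting; reconnect or SET again to revert
  EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
  from notiz_objekt a
  where not exists
  (
    select 1
    from notiz_gelesen b
    where ma_id = 2001
    and ma_pid = 1097
    and a.notiz_id = b.notiz_id
  );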
At 20:12 17.07.2003, Tom Lane wrote:
>Fabian Kreitner <fabian.kreitner@ainea-ag.de> writes:
> > That is what I read too, and is why I'm confused that the index is indeed
> > executing faster. Can this be a problem with the hardware and/or the
> > Postgres installation?
>
>I think the actual issue here is that you are executing the EXISTS
>subplan over and over, once for each outer row.  The planner's cost
>estimate for EXISTS is based on the assumption that you do it once ...
>in which scenario the seqscan very possibly is cheaper.  However, when
>you do the EXISTS subplan over and over for many outer rows, you get a
>savings from the fact that the index and table pages soon get cached in
>memory.  The seqscan plan gets a savings too, since the table is small
>enough to fit in memory, but once everything is in memory the indexscan
>plan is faster.
>
>There's been some discussion on pghackers about how to teach the planner
>to account for repeated executions of subplans, but we have not come up
>with a good solution yet.
>
>For the moment, what people tend to do if they know their database is
>small enough to mostly stay in memory is to reduce random_page_cost to
>make the planner favor indexscans.  If you know the database is entirely
>cached then the theoretically correct value of random_page_cost is 1.0
>(since fetching any page will cost the same, if it's all in RAM).  I'd
>recommend against adopting that as a default, but a lot of people find
>that setting it to 2.0 or so seems to model their situation better than
>the out-of-the-box 4.0.

Thanks for the explanation :)

However... :(

perg_1097=# vacuum analyze;
VACUUM
perg_1097=# set random_page_cost to 1.0;
SET VARIABLE
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=0.00..56125.80 rows=15561 width=12) (actual time=0.27..2299.09 rows=31122 loops=1)
  SubPlan
    ->  Seq Scan on notiz_gelesen b  (cost=0.00..1.79 rows=1 width=0) (actual time=0.07..0.07 rows=0 loops=31122)
Total runtime: 2328.05 msec

EXPLAIN
perg_1097=#

...

perg_1097=# set enable_seqscan to false;
SET VARIABLE
perg_1097=# set random_page_cost to 1.0;
SET VARIABLE
perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ
perg_1097-# from notiz_objekt a
perg_1097-# where not exists
perg_1097-# (
perg_1097(# select 1
perg_1097(# from notiz_gelesen b
perg_1097(# where ma_id = 2001
perg_1097(# and ma_pid = 1097
perg_1097(# and a.notiz_id = b.notiz_id
perg_1097(# )
perg_1097-# ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a  (cost=100000000.00..100093380.36 rows=15561 width=12) (actual time=0.07..550.07 rows=31122 loops=1)
  SubPlan
    ->  Index Scan using idx_notiz_gelesen_2 on notiz_gelesen b  (cost=0.00..2.98 rows=1 width=0) (actual time=0.01..0.01 rows=0 loops=31122)
Total runtime: 582.90 msec

EXPLAIN
perg_1097=#

Even with a random_page_cost of 1, it thinks using the index should/could
take significantly longer, which it doesn't for some reason :-/
Hi all,

Adjusting the cpu_tuple_cost to 0.042 got the planner to choose the index.
Anything I need to consider when raising it to such "high" values?

Thanks for the help,
  Fabian
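Presumably the setting was applied per session with SET; a sketch of both ways it could be applied (the 0.042 value is just the one found by trial above, not a general recommendation):

  -- per session, as used for the tests:
  set cpu_tuple_cost to 0.042;

  -- or for the whole cluster in postgresql.conf, taking effect for new
  -- sessions after a server reload/restart:
  -- cpu_tuple_cost = 0.042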
On Fri, 18 Jul 2003, Fabian Kreitner wrote:

> Adjusting the cpu_tuple_cost to 0.042 got the planner to choose the index.

Doesn't sound very good, and it will most likely make other queries slower.
You could always turn off sequential scans before that query and turn them
back on afterwards (see the sketch below).

> Anything I need to consider when raising it to such "high" values?

You could fill the table with more data and it will probably reach a point
where it stops using the seq. scan. You could of course also change pg
itself so it calculates a better estimate.

--
/Dennis
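A minimal sketch of that first suggestion, using the query from the start of the thread; enable_seqscan only affects the current session and is switched back immediately afterwards.

  set enable_seqscan to false;
  select notiz_id, obj_id, obj_typ
  from notiz_objekt a
  where not exists
  (
    select 1
    from notiz_gelesen b
    where ma_id = 2001
    and ma_pid = 1097
    and a.notiz_id = b.notiz_id
  );
  set enable_seqscan to true;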
Dennis Björklund <db@zigo.dhs.org> writes:
> On Fri, 18 Jul 2003, Fabian Kreitner wrote:
>> Adjusting the cpu_tuple_cost to 0.042 got the planner to choose the index.

> Doesn't sound very good, and it will most likely make other queries slower.

Seems like a reasonable approach to me --- certainly better than setting
random_page_cost to physically nonsensical values.

In a fully-cached situation it's entirely reasonable to inflate the
various cpu_xxx costs, since by assumption you are not paying the normal
price of physical disk I/O.  Fetching a page from kernel buffer cache
is certainly cheaper than getting it off the disk.  But the CPU costs
involved in processing the page contents don't change.  Since our cost
unit is defined as 1.0 = one sequential page fetch, you have to increase
the cpu_xxx numbers instead of reducing the I/O cost estimate.

I would recommend inflating all the cpu_xxx costs by the same factor,
unless you have evidence that they are wrong in relation to each other.

			regards, tom lane
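A sketch of what scaling all the cpu_xxx settings by one common factor could look like, assuming the stock 7.2 defaults of 0.01, 0.001 and 0.0025 and an illustrative factor of 4 (the factor itself is something to tune, not a recommendation):

  set cpu_tuple_cost       to 0.04;    -- 4 x 0.01
  set cpu_index_tuple_cost to 0.004;   -- 4 x 0.001
  set cpu_operator_cost    to 0.01;    -- 4 x 0.0025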
On Fri, 18 Jul 2003, Tom Lane wrote:
> Dennis Björklund <db@zigo.dhs.org> writes:
> > On Fri, 18 Jul 2003, Fabian Kreitner wrote:
> >> Adjusting the cpu_tuple_cost to 0.042 got the planner to choose the index.
> >
> > Doesn't sound very good, and it will most likely make other queries slower.
>
> Seems like a reasonable approach to me --- certainly better than setting
> random_page_cost to physically nonsensical values.
>
> In a fully-cached situation it's entirely reasonable to inflate the
> various cpu_xxx costs, since by assumption you are not paying the normal
> price of physical disk I/O.  Fetching a page from kernel buffer cache
> is certainly cheaper than getting it off the disk.  But the CPU costs
> involved in processing the page contents don't change.  Since our cost
> unit is defined as 1.0 = one sequential page fetch, you have to increase
> the cpu_xxx numbers instead of reducing the I/O cost estimate.
>
> I would recommend inflating all the cpu_xxx costs by the same factor,
> unless you have evidence that they are wrong in relation to each other.

And don't forget to set effective_cache_size. It's the one I missed for
the longest time when I started.
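A sketch of setting it: in 7.2 effective_cache_size is expressed in 8 kB disk pages, so an assumed 128 MB of OS cache would be roughly 16384 pages (the 128 MB figure is purely illustrative; size it to what the machine actually keeps in its buffer cache).

  set effective_cache_size to 16384;   -- ~128 MB at 8 kB per page (assumed figure)
  -- or persistently in postgresql.conf:
  -- effective_cache_size = 16384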
On Fri, 18 Jul 2003, Tom Lane wrote:
> >> Adjusting the cpu_tuple_cost to 0.042 got the planner to choose the index.
> >
> > Doesn't sound very good, and it will most likely make other queries slower.
>
> Seems like a reasonable approach to me --- certainly better than setting
> random_page_cost to physically nonsensical values.

Hehe, just before this letter there was talk about changing
random_page_cost. I kind of responded as if 0.042 were a (not very good)
random_page_cost. But now of course I can see that it says cpu_tuple_cost :-)

Sorry for adding confusion.

--
/Dennis