Thread: Understanding sequential versus index scans.
Hi. I notice that when I do a WHERE x, Postgres uses an index, and when I do WHERE y, it does so as well, but when I do WHERE x OR y, it doesn't. Why is this so? And how can I shut this off?
select * from dict
where
word in (select substr('moon', 0, generate_series(3,length('moon')))) -- this is my X above
OR word like 'moon%' -- this is my Y above
where
word in (select substr('moon', 0, generate_series(3,length('moon')))) -- this is my X above
OR word like 'moon%' -- this is my Y above
Seq Scan on dict (cost=0.02..2775.66 rows=30422 width=24) (actual time=16.635..28.580 rows=8 loops=1)
Filter: ((hashed subplan) OR ((word)::text ~~ 'moon%'::text))
SubPlan
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.014..0.019 rows=2 loops=1)
Total runtime: 28.658 ms
(Using just X or Y alone uses the index, and completes in 0.150 ms)
Is this a bug?
PS Running "PostgreSQL 8.2.1 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special)"
On Sun, Jul 19, 2009 at 6:58 PM, Robert James <srobertjames@gmail.com> wrote:
Hi. I notice that when I do a WHERE x, Postgres uses an index, and when I do WHERE y, it does so as well, but when I do WHERE x OR y, it doesn't. Why is this so? And how can I shut this off?select * from dict
where
word in (select substr('moon', 0, generate_series(3,length('moon')))) -- this is my X above
OR word like 'moon%' -- this is my Y above
Seq Scan on dict (cost=0.02..2775.66 rows=30422 width=24) (actual time=16.635..28.580 rows=8 loops=1)
Filter: ((hashed subplan) OR ((word)::text ~~ 'moon%'::text))
SubPlan
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.014..0.019 rows=2 loops=1)
Total runtime: 28.658 ms(Using just X or Y alone uses the index, and completes in 0.150 ms)Is this a bug?
On Sun, Jul 19, 2009 at 11:59 PM, Robert James<srobertjames@gmail.com> wrote: > PS Running "PostgreSQL 8.2.1 on i686-pc-mingw32, compiled by GCC gcc.exe > (GCC) 3.4.2 (mingw-special)" > > On Sun, Jul 19, 2009 at 6:58 PM, Robert James <srobertjames@gmail.com> > wrote: >> >> Hi. I notice that when I do a WHERE x, Postgres uses an index, and when I >> do WHERE y, it does so as well, but when I do WHERE x OR y, it doesn't. Why >> is this so? And how can I shut this off? >> select * from dict >> where >> word in (select substr('moon', 0, generate_series(3,length('moon')))) -- >> this is my X above >> OR word like 'moon%' -- this is my Y above >> Seq Scan on dict (cost=0.02..2775.66 rows=30422 width=24) (actual >> time=16.635..28.580 rows=8 loops=1) >> Filter: ((hashed subplan) OR ((word)::text ~~ 'moon%'::text)) >> SubPlan >> -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.014..0.019 >> rows=2 loops=1) >> Total runtime: 28.658 ms >> (Using just X or Y alone uses the index, and completes in 0.150 ms) >> Is this a bug? Well there are known bugs in 8.2.1 -- that's why the current 8.2 release is 8.2.13. The next step here is to set enable_seqscan=off and run explain analyze again. You may have to adjust some costs to sync the estimated cost with actual run-time. -- greg http://mit.edu/~gsstark/resume.pdf
Robert James <srobertjames@gmail.com> writes: > Hi. I notice that when I do a WHERE x, Postgres uses an index, and when I > do WHERE y, it does so as well, but when I do WHERE x OR y, it > doesn't. It can use indexes for OR conditions, but not for arbitrary OR conditions... > select * from dict > where > word in (select substr('moon', 0, generate_series(3,length('moon')))) -- > this is my X above > OR word like 'moon%' -- this is my Y above ... and that one is pretty arbitrary. You might have some luck with using a UNION instead, viz select * from dict where X union all select * from dict where Y regards, tom lane
Robert James wrote: > Hi. I notice that when I do a WHERE x, Postgres uses an index, and > when I do WHERE y, it does so as well, but when I do WHERE x OR y, it > doesn't. Why is this so? And how can I shut this off? maybe its because you have no index on (X OR Y) ? or maybe because the analyzer thinks that X or Y includes enough rows that a sequential scan is more effective ?
UNION was better, but still 5 times as slow as either query done individually.
set enable_seqscan=off didn't help at all - it was totally ignored
Is there anything else I can do?
On Sun, Jul 19, 2009 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert James <srobertjames@gmail.com> writes:It can use indexes for OR conditions, but not for arbitrary OR
> Hi. I notice that when I do a WHERE x, Postgres uses an index, and when I
> do WHERE y, it does so as well, but when I do WHERE x OR y, it
> doesn't.
conditions...... and that one is pretty arbitrary. You might have some luck with
> select * from dict
> where
> word in (select substr('moon', 0, generate_series(3,length('moon')))) --
> this is my X above
> OR word like 'moon%' -- this is my Y above
using a UNION instead, viz
select * from dict where X
union all
select * from dict where Y
regards, tom lane
Is there anyway to tell Postgres "Run these two queries, and union their results, but don't change the plan as to a UNION - just run them separately"?
Something seems funny to me that running a UNION should be twice as slow as running the two queries one after the other.
On Sun, Jul 19, 2009 at 8:10 PM, Robert James <srobertjames@gmail.com> wrote:
UNION was better, but still 5 times as slow as either query done individually.set enable_seqscan=off didn't help at all - it was totally ignoredIs there anything else I can do?On Sun, Jul 19, 2009 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Robert James <srobertjames@gmail.com> writes:It can use indexes for OR conditions, but not for arbitrary OR
> Hi. I notice that when I do a WHERE x, Postgres uses an index, and when I
> do WHERE y, it does so as well, but when I do WHERE x OR y, it
> doesn't.
conditions...... and that one is pretty arbitrary. You might have some luck with
> select * from dict
> where
> word in (select substr('moon', 0, generate_series(3,length('moon')))) --
> this is my X above
> OR word like 'moon%' -- this is my Y above
using a UNION instead, viz
select * from dict where X
union all
select * from dict where Y
regards, tom lane
On Sun, Jul 19, 2009 at 6:10 PM, Robert James<srobertjames@gmail.com> wrote: > UNION was better, but still 5 times as slow as either query done > individually. > set enable_seqscan=off didn't help at all - it was totally ignored > Is there anything else I can do? Did you try union, or union all?
Yes, I had done UNION. UNION ALL achives the expected plan and speed! Thank you!
BTW, this is interesting, because there are only about 5 or 6 rows max returned from both queries - but I guess the planner expects more and hence changes the plan to remove duplicates.
On Sun, Jul 19, 2009 at 9:05 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Sun, Jul 19, 2009 at 6:10 PM, Robert James<srobertjames@gmail.com> wrote:Did you try union, or union all?
> UNION was better, but still 5 times as slow as either query done
> individually.
> set enable_seqscan=off didn't help at all - it was totally ignored
> Is there anything else I can do?
On Mon, Jul 20, 2009 at 2:22 PM, Robert James<srobertjames@gmail.com> wrote: > BTW, this is interesting, because there are only about 5 or 6 rows max > returned from both queries - but I guess the planner expects more and hence > changes the plan to remove duplicates. If you sent the plans for the various attempts we might be able to explain what's going on. -- greg http://mit.edu/~gsstark/resume.pdf