Thread: Understanding sequential versus index scans.

Understanding sequential versus index scans.

From
Robert James
Date:
Hi.  I notice that when I do a WHERE x, Postgres uses an index, and when I do WHERE y, it does so as well, but when I do WHERE x OR y, it doesn't.  Why is this so? And how can I shut this off?
select * from dict
where 
 word in (select substr('moon', 0, generate_series(3,length('moon')))) -- this is my X above
 OR word like 'moon%' -- this is my Y above

Seq Scan on dict (cost=0.02..2775.66 rows=30422 width=24) (actual time=16.635..28.580 rows=8 loops=1)
 Filter: ((hashed subplan) OR ((word)::text ~~ 'moon%'::text))
 SubPlan
 -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.014..0.019 rows=2 loops=1)
Total runtime: 28.658 ms
(Using just X or Y alone uses the index, and completes in 0.150 ms)
Is this a bug?




Re: Understanding sequential versus index scans.

From
Robert James
Date:
PS Running "PostgreSQL 8.2.1 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special)"

On Sun, Jul 19, 2009 at 6:58 PM, Robert James <srobertjames@gmail.com> wrote:
Hi.  I notice that when I do a WHERE x, Postgres uses an index, and when I do WHERE y, it does so as well, but when I do WHERE x OR y, it doesn't.  Why is this so? And how can I shut this off?
select * from dict
where 
 word in (select substr('moon', 0, generate_series(3,length('moon')))) -- this is my X above
 OR word like 'moon%' -- this is my Y above

Seq Scan on dict (cost=0.02..2775.66 rows=30422 width=24) (actual time=16.635..28.580 rows=8 loops=1)
 Filter: ((hashed subplan) OR ((word)::text ~~ 'moon%'::text))
 SubPlan
 -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.014..0.019 rows=2 loops=1)
Total runtime: 28.658 ms
(Using just X or Y alone uses the index, and completes in 0.150 ms)
Is this a bug?





Re: Understanding sequential versus index scans.

From
Greg Stark
Date:
On Sun, Jul 19, 2009 at 11:59 PM, Robert James<srobertjames@gmail.com> wrote:
> PS Running "PostgreSQL 8.2.1 on i686-pc-mingw32, compiled by GCC gcc.exe
> (GCC) 3.4.2 (mingw-special)"
>
> On Sun, Jul 19, 2009 at 6:58 PM, Robert James <srobertjames@gmail.com>
> wrote:
>>
>> Hi.  I notice that when I do a WHERE x, Postgres uses an index, and when I
>> do WHERE y, it does so as well, but when I do WHERE x OR y, it doesn't.  Why
>> is this so? And how can I shut this off?
>> select * from dict
>> where
>>  word in (select substr('moon', 0, generate_series(3,length('moon')))) --
>> this is my X above
>>  OR word like 'moon%' -- this is my Y above
>> Seq Scan on dict (cost=0.02..2775.66 rows=30422 width=24) (actual
>> time=16.635..28.580 rows=8 loops=1)
>>  Filter: ((hashed subplan) OR ((word)::text ~~ 'moon%'::text))
>>  SubPlan
>>  -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.014..0.019
>> rows=2 loops=1)
>> Total runtime: 28.658 ms
>> (Using just X or Y alone uses the index, and completes in 0.150 ms)
>> Is this a bug?

Well there are known bugs in 8.2.1 -- that's why the current 8.2
release is 8.2.13.

The next step here is to set enable_seqscan=off and run explain
analyze again. You may have to adjust some costs to sync the estimated
cost with actual run-time.

--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Understanding sequential versus index scans.

From
Tom Lane
Date:
Robert James <srobertjames@gmail.com> writes:
> Hi.  I notice that when I do a WHERE x, Postgres uses an index, and when I
> do WHERE y, it does so as well, but when I do WHERE x OR y, it
> doesn't.

It can use indexes for OR conditions, but not for arbitrary OR
conditions...

> select * from dict
> where
>  word in (select substr('moon', 0, generate_series(3,length('moon')))) --
> this is my X above
>  OR word like 'moon%' -- this is my Y above

... and that one is pretty arbitrary.  You might have some luck with
using a UNION instead, viz

select * from dict where X
union all
select * from dict where Y

            regards, tom lane

Re: Understanding sequential versus index scans.

From
John R Pierce
Date:
Robert James wrote:
> Hi.  I notice that when I do a WHERE x, Postgres uses an index, and
> when I do WHERE y, it does so as well, but when I do WHERE x OR y, it
> doesn't.  Why is this so? And how can I shut this off?

maybe its because you have no index on (X OR Y) ?  or maybe because the
analyzer thinks that X or Y includes enough rows that a sequential scan
is more effective ?





Re: Understanding sequential versus index scans.

From
Robert James
Date:
UNION was better, but still 5 times as slow as either query done individually.
set enable_seqscan=off didn't help at all - it was totally ignored
Is there anything else I can do?

On Sun, Jul 19, 2009 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert James <srobertjames@gmail.com> writes:
> Hi.  I notice that when I do a WHERE x, Postgres uses an index, and when I
> do WHERE y, it does so as well, but when I do WHERE x OR y, it
> doesn't.

It can use indexes for OR conditions, but not for arbitrary OR
conditions...

> select * from dict
> where
>  word in (select substr('moon', 0, generate_series(3,length('moon')))) --
> this is my X above
>  OR word like 'moon%' -- this is my Y above

... and that one is pretty arbitrary.  You might have some luck with
using a UNION instead, viz

select * from dict where X
union all
select * from dict where Y

                       regards, tom lane

Re: Understanding sequential versus index scans.

From
Robert James
Date:
Is there anyway to tell Postgres "Run these two queries, and union their results, but don't change the plan as to a UNION - just run them separately"?
Something seems funny to me that running a UNION should be twice as slow as running the two queries one after the other.

On Sun, Jul 19, 2009 at 8:10 PM, Robert James <srobertjames@gmail.com> wrote:
UNION was better, but still 5 times as slow as either query done individually.
set enable_seqscan=off didn't help at all - it was totally ignored
Is there anything else I can do?

On Sun, Jul 19, 2009 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert James <srobertjames@gmail.com> writes:
> Hi.  I notice that when I do a WHERE x, Postgres uses an index, and when I
> do WHERE y, it does so as well, but when I do WHERE x OR y, it
> doesn't.

It can use indexes for OR conditions, but not for arbitrary OR
conditions...

> select * from dict
> where
>  word in (select substr('moon', 0, generate_series(3,length('moon')))) --
> this is my X above
>  OR word like 'moon%' -- this is my Y above

... and that one is pretty arbitrary.  You might have some luck with
using a UNION instead, viz

select * from dict where X
union all
select * from dict where Y

                       regards, tom lane


Re: Understanding sequential versus index scans.

From
Scott Marlowe
Date:
On Sun, Jul 19, 2009 at 6:10 PM, Robert James<srobertjames@gmail.com> wrote:
> UNION was better, but still 5 times as slow as either query done
> individually.
> set enable_seqscan=off didn't help at all - it was totally ignored
> Is there anything else I can do?

Did you try union, or union all?

Re: Understanding sequential versus index scans.

From
Robert James
Date:
Yes, I had done UNION.  UNION ALL achives the expected plan and speed! Thank you!
BTW, this is interesting, because there are only about 5 or 6 rows max returned from both queries - but I guess the planner expects more and hence changes the plan to remove duplicates.

On Sun, Jul 19, 2009 at 9:05 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Sun, Jul 19, 2009 at 6:10 PM, Robert James<srobertjames@gmail.com> wrote:
> UNION was better, but still 5 times as slow as either query done
> individually.
> set enable_seqscan=off didn't help at all - it was totally ignored
> Is there anything else I can do?

Did you try union, or union all?

Re: Understanding sequential versus index scans.

From
Greg Stark
Date:
On Mon, Jul 20, 2009 at 2:22 PM, Robert James<srobertjames@gmail.com> wrote:
> BTW, this is interesting, because there are only about 5 or 6 rows max
> returned from both queries - but I guess the planner expects more and hence
> changes the plan to remove duplicates.


If you sent the plans for the various attempts we might be able to
explain what's going on.

--
greg
http://mit.edu/~gsstark/resume.pdf