Thread: Or selection on index versus union
Hello I've got a table with an index, let's call it fase. The following query is fine: 'select something from table where fase = '1'; However, this is disastrously slow: select something from table where fase = '1' or fase = '2'; The reason is that the query planner decides to ignore the index, and goes for a sequential scan of the table (with a couple of million records). If I do: select something from table where fase = '1' union select something from table where fase = '2'; it's fine again, but it's a lot of typing, and the first formulation has a more natural feel to it. Is there a way to convince the planner to use the fase index for this type of query, or is there a hook somewhere that I missed that allows me to rewrite a query like the above with a server-side function ? Thanks in advance, Han Holl
Just a guess but have you tried using an in clause? select something from table where fase in ('1' , '2'); On Tuesday 04 October 2005 13:32, han.holl@informationslogik.nl wrote: > Hello > > I've got a table with an index, let's call it fase. > > The following query is fine: 'select something from table where fase = '1'; > > However, this is disastrously slow: > select something from table where fase = '1' or fase = '2'; > > The reason is that the query planner decides to ignore the index, and goes > for a sequential scan of the table (with a couple of million records). > > If I do: > select something from table where fase = '1' > union > select something from table where fase = '2'; > > it's fine again, but it's a lot of typing, and the first formulation has a > more natural feel to it. > > Is there a way to convince the planner to use the fase index for this type > of query, or is there a hook somewhere that I missed that allows me to > rewrite a query like the above with a server-side function ? > > Thanks in advance, > > Han Holl > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq
On Tue, Oct 04, 2005 at 09:32:41PM +0200, han.holl@informationslogik.nl wrote: > I've got a table with an index, let's call it fase. > > The following query is fine: 'select something from table where fase = '1'; > > However, this is disastrously slow: > select something from table where fase = '1' or fase = '2'; Could we see some EXPLAIN ANALYZE output? What version of PostgreSQL are you using? Have you run VACUUM ANALYZE on the table to remove dead tuples and update the statistics? Have you considered clustering the table on fase's index? It might be useful to see the output of the following queries, assuming the table name is foo: SET enable_seqscan TO on; SET enable_indexscan TO off; EXPLAIN ANALYZE SELECT something FROM foo WHERE fase = '1'; EXPLAIN ANALYZE SELECT something FROM foo WHERE fase = '2'; EXPLAIN ANALYZE SELECT something FROM foo WHERE fase = '1' OR fase = '2'; SET enable_seqscan TO off; SET enable_indexscan TO on; EXPLAIN ANALYZE SELECT something FROM foo WHERE fase = '1'; EXPLAIN ANALYZE SELECT something FROM foo WHERE fase = '2'; EXPLAIN ANALYZE SELECT something FROM foo WHERE fase = '1' OR fase = '2'; SHOW random_page_cost; SHOW effective_cache_size; SELECT version(); \x SELECT * FROM pg_stats WHERE tablename = 'foo' AND attname = 'fase'; BTW, pgsql-performance is a list dedicated to performance issues, so you might want to ask performance-related questions there. -- Michael Fuhr
On Tuesday 04 October 2005 22:26, kevin.kempter@dataintellect.com wrote: > Just a guess but have you tried using an in clause? > > select something from table where fase in ('1' , '2'); > Not surprisingly, this also does a sequential scan on the table. (It would be _very_ diffcult to see that this is a union, I guess). Thanks, Han
On Tuesday 04 October 2005 23:08, Michael Fuhr wrote:
> On Tue, Oct 04, 2005 at 09:32:41PM +0200, han.holl@informationslogik.nl wrote:
> > I've got a table with an index, let's call it fase.
> >
> > The following query is fine: 'select something from table where fase =
> > '1';
> >
> > However, this is disastrously slow:
> > select something from table where fase = '1' or fase = '2';
>
> Could we see some EXPLAIN ANALYZE output? What version of PostgreSQL
> are you using? Have you run VACUUM ANALYZE on the table to remove
> dead tuples and update the statistics? Have you considered clustering
> the table on fase's index?
>
Oh, well, thanks. I hadn't realized that a newly loaded database needs a vacuum analyze to begin with. And what's worse, I had the impression that vacuum full would include analyze, wich I see now it doesn't.
I'm not a database administrator, and I'm afraid it shows. I'm going to read a lot of docs in the coming months, because real people depend on reasonable performance of our databases.
Cheers, and thanks again,
Han Holl
PS We still have to be careful how to formulate conditions:
where fase in ('1','2')
is ok, and uses the index, but the logically identical:
where position(fase in '12') >= 1
does a sequential scan.
han.holl@informationslogik.nl writes: > PS We still have to be careful how to formulate conditions: > > where fase in ('1','2') > > is ok, and uses the index, but the logically identical: > > where position(fase in '12') >= 1 > > does a sequential scan. Right, you would need a functional index to fix that. The query optimizer, in general, doesn't know what specific functions do--they're black boxes. So there would be no way for the optimizer to figure out that the two expressions mean the same thing. -Doug
On Wed, Oct 05, 2005 at 11:13:58AM +0200, han.holl@informationslogik.nl wrote: > On Tuesday 04 October 2005 23:08, Michael Fuhr wrote: > > On Tue, Oct 04, 2005 at 09:32:41PM +0200, han.holl@informationslogik.nl wrote: > > > I've got a table with an index, let's call it fase. > > > > > > The following query is fine: 'select something from table where fase = > > > '1'; > > > > > > However, this is disastrously slow: > > > select something from table where fase = '1' or fase = '2'; > > > > Could we see some EXPLAIN ANALYZE output? What version of PostgreSQL > > are you using? Have you run VACUUM ANALYZE on the table to remove > > dead tuples and update the statistics? Have you considered clustering > > the table on fase's index? > > > > Oh, well, thanks. I hadn't realized that a newly loaded database needs a vacuum analyze to begin with. And what's worse,I had the impression that vacuum full would include analyze, wich I see now it doesn't. When first loaded, there are no statistics for a table, so the database has no idea what to expect. You don't need to vacuum a brand-new table, but you do need to analyze it. We've actually talked about changing the name of vacuum full so as not to confuse people. But you really want to stay away from vacuum full unless you have no choice, because of the exclusive lock on the table it grabs. > I'm not a database administrator, and I'm afraid it shows. I'm going to read a lot of docs in the coming months, becausereal people depend on reasonable performance of our databases. If you need immediate help you can get commercial support from a number of companies. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461