Re: [Gmod-schema] Re: performace problem after VACUUM - Mailing list pgsql-performance
From | Scott Cain |
---|---|
Subject | Re: [Gmod-schema] Re: performace problem after VACUUM |
Date | |
Msg-id | 1045341390.3944.678.camel@localhost.localdomain Whole thread Raw |
In response to | Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-performance |
Hello Tom, Here's the short answer: I've got it working much faster now (>100 msec for the query by explain analyze). Here's the long answer: I reworked the table, horribly denormalizing it. I changed the coordinate system, so that start is always less than end, regardless of strand. Here is the original query: select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id from feature f, featureloc fl where fl.srcfeature_id = 1 and ((fl.strand=1 and fl.nbeg <= 393164 and fl.nend >= 390956) OR (fl.strand=-1 and fl.nend <= 393164 and fl.nbeg >= 390956)) and f.feature_id = fl.feature_id and here is the equivalent query in the new coordinate system: select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id from feature f, featureloc fl where fl.srcfeature_id = 1 and f.feature_id = fl.feature_id and fl.max >= 390956 and fl.min <= 393164 Notice that it is MUCH simpler, and the query planner uses exactly the indexes I want, and as noted above, runs much faster. Of course, this also means that I have to rewrite my database adaptor, but it shouldn't be too bad. For those on the GMOD list, here is how I changed the table: alter table featureloc add column min int; alter table featureloc add column max int; update featureloc set min=nbeg where strand=1; update featureloc set max=nend where strand=1; update featureloc set max=nbeg where strand=-1; update featureloc set min=nend where strand=-1; update featureloc set min=nbeg where (strand=0 or strand is null) and nbeg<nend; update featureloc set max=nend where (strand=0 or strand is null) and nbeg<nend; update featureloc set min=nend where (strand=0 or strand is null) and nbeg>nend; update featureloc set max=nbeg where (strand=0 or strand is null) and nbeg>nend; create index featureloc_src_min_max on featureloc (srcfeature_id,min,max); select count(*) from featureloc where min is null and nbeg is not null; The last select is just a test to make sure I didn't miss anything, and it did return zero. Also, it doesn't appear that there are any features that are strandless. I found that a little surprising, but included those updates for completeness. Tom, thank you much for your help. Hopefully, I will get the group to buy into this schema change, and life will be good. Scott On Fri, 2003-02-14 at 19:11, Tom Lane wrote: > SLetovsky@aol.com writes: > > Am I correct in interpreting your comments as saying you believe that > > if we could lose the OR and the strand constraint PG would probably > > use the index properly? > > No, I said I thought it could do it without that ;-). But yes, you'd > have a much less fragile query if you could lose the OR condition. > > Have you looked into using a UNION ALL instead of OR to merge the two > sets of results? It sounds grotty, but might be faster... > > regards, tom lane -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory
pgsql-performance by date: