Re: PostgreSQL OR performance - Mailing list pgsql-performance

From Віталій Тимчишин
Subject Re: PostgreSQL OR performance
Date
Msg-id 331e40660811060937q460e6f9fvf336455200d31f56@mail.gmail.com
Whole thread Raw
In response to Re: PostgreSQL OR performance  (Richard Huxton <dev@archonet.com>)
Responses Re: PostgreSQL OR performance  ("Віталій Тимчишин" <tivv00@gmail.com>)
List pgsql-performance


2008/11/6 Richard Huxton <dev@archonet.com>
Віталій Тимчишин wrote:
> As you can see from other plans, it do have all the indexes to perform it's
> work fast (when given part by part). It simply do not wish to use them. My
> question: Is this a configuration problem or postgresql optimizer simply
> can't do such a query rewrite?

I must admit, I haven't managed to figure out what your query is trying
to do, but then that's a common problem with autogenerated queries.

That's easy - I am looking for duplicates from subset of companies. Two companies are equal when there names are simply equal or there is an entry in "match" table for names.
 


The main question that needs answering is why the planner thinks you're
going to get 1.3 billion rows in the "or" query:

"Nested Loop  (cost=4588.13..960900482668.95 rows=1386158171 width=32)"

You don't show "explain analyse" for this query, so there's no way of
knowing how many rows get returned but presumably you're expecting
around 88000. What does "explain analyse" return?

Yes, the query should output exactly same result as in "Union" plan. I will run "slow" explain analyze now and will repost after it will complete (tomorrow?).
BTW: I'd say planner should think rows estimated as sum of "ORs" estimation minus intersection, but no more then sum or ORs (if intersection is 0). For first condition it has rows=525975, for second it has rows=2403 (with other plans, of course), so it's strange it has such a high estimation.... It's exactly 50% of full cartesian join of merge, so it does think that every second pair would succeed, that is not true.

pgsql-performance by date:

Previous
From: "Helio Campos Mello de Andrade"
Date:
Subject: Re: PostgreSQL OR performance
Next
From: "David Rees"
Date:
Subject: Re: Occasional Slow Commit