Re: Weird, bad 0.5% selectivity estimate for a column equal to itself - Mailing list pgsql-performance

From Josh Berkus
Subject Re: Weird, bad 0.5% selectivity estimate for a column equal to itself
Date
Msg-id 51CA2370.7040803@agliodbs.com
Whole thread Raw
In response to Weird, bad 0.5% selectivity estimate for a column equal to itself  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Weird, bad 0.5% selectivity estimate for a column equal to itself  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
> Personally, I'll bet lunch that that external software is outright
> broken, ie it probably thinks "X = X" is constant true and they found
> they could save two lines of code and a few machine cycles by emitting
> that rather than not emitting anything.  Of course, the amount of
> parsing/planning time wasted in dealing with the useless-and-incorrect
> clause exceeds what was saved by multiple orders of magnitude, but hey
> it was easy.

Well, it was more in the form of:

tab1.x = COALESCE(tab2.y,tab1.x)

... which some programmer 8 years ago though would be a cute shorthand for:

tab.x = tab2.y OR tab2.y IS NULL

Still stupid, sure, but when you're dealing with partly-third-party
legacy software which was ported from MSSQL (which has issues with "IS
NULL"), that's what you get.

> It wouldn't take too much new code to get the planner to replace "X = X"
> with "X IS NOT NULL", but I think we're probably fixing the wrong piece
> of software if we do.

Well, I'd be more satisfied with having a solution for:

WHERE tab1.x = tab1.y

... in general, even if it didn't have correlation stats.  Like, what's
preventing us from using the same selectivity logic we would on a join
for that?  It wouldn't be accurate for highly correlated columns (or for
colX = colx) but it would be a damsight better than defaultsel. Heck,
even multiplying the the two ndistincts together would be an improvement ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


pgsql-performance by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: incorrect row estimates for primary key join
Next
From: Tom Lane
Date:
Subject: Re: incorrect row estimates for primary key join