Fighting the planner >:-( - Mailing list pgsql-performance

From Casey Allen Shobe
Subject Fighting the planner >:-(
Date
Msg-id CAFmVg3jE_HymBX558M5+Hh3iaUm7OAj_FVk98SGuGk8B3sWuvA@mail.gmail.com
Whole thread Raw
Responses Re: Fighting the planner >:-(  (Casey Allen Shobe <casey@shobe.info>)
List pgsql-performance
So where I'm working, a performance issue was identified that affected many functions, because the (SQL language) functions took an int argument used it in a where clause against a column (config_id) that was stored in varchar format, leading to an inefficient casting when the query was parameterized.  We could work around that with (select $3::text) instead of just $3, but since the data is actually all numbers under 65k, we altered the data type of the column to smallint, rather than editing a boatload of functions with a hacky workaround.

For most functions, this fixed the problem.

However, it had a drastically-negative impact on the query in question, which was originally taking 2 minutes, 45 seconds.  After adding a couple indexes with the config_id still as a varchar, that time is reduced down to 42 seconds.  However when the data type is smallint, the query runs for many hours - I let it run for 4.5 hours yesterday before cancelling it.

It's pretty clear that the planner is making horrid misestimates and picking a terrible plan.  I would appreciate any advice for getting this into a better state.

Here are the explain plans:

When config_id is a varchar, it executes in 42 seconds:

When config_id is a smallint, it runs too long to allow to complete, but clearly the plan is bad:

Here is the query, along with rowcounts and schema of every table involved in the query:

PostgreSQL version is 8.4, and most of our GUC's are default.

Thanks in advance for any suggestions.
--
Casey Allen Shobe
casey@shobe.info


pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Simple join doesn't use index
Next
From: Casey Allen Shobe
Date:
Subject: Re: Fighting the planner >:-(