Re: Select performance variation based on the different combinations of using where lower(), order by, and limit - Mailing list pgsql-general

From Tyler Reese
Subject Re: Select performance variation based on the different combinations of using where lower(), order by, and limit
Date
Msg-id CAFvRLyeFZZq5wk4ovpYQhYtc8HxxF93u40LphhRhQXiDJonkyA@mail.gmail.com
Whole thread Raw
In response to Re: Select performance variation based on the different combinations of using where lower(), order by, and limit  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: Select performance variation based on the different combinations of using where lower(), order by, and limit  (Kevin Grittner <kgrittn@ymail.com>)
Re: Select performance variation based on the different combinations of using where lower(), order by, and limit  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-general
So, since it thinks it needs to read 1/412th of the table is the reason why the query planner chooses to use the primary key index instead of the callingpartynumber index, like it does in the first 3 cases?  I'm curious as to why it says "rows=41212".  Is that the estimate of the number of rows that meet the filter condition?  Where does that come from?

I haven't heard of raising the statistics target, so I'll read up on that.  A few days ago, all 4 cases were responding equally fast.  I had been messing around with the postgres settings, and I went and dropped all of the indexes and recreated them just to see what would happen.  I wouldn't think that recreating the indexes would cause case 4 to go slow, but that's the symptom I am seeing now.  Should I be running analyze on a table after it has been reindexed?


On Sun, Aug 18, 2013 at 3:02 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Tyler Reese <jukey91@gmail.com> wrote:

> I don't understand why the performance of case 4 is so much slower

>case 4:
>mydb=> explain analyze SELECT * FROM "cdr" WHERE lower("CallingPartyNumber") = '9725551212' order by "key" limit 100;

> Limit  (cost=0.00..72882.05 rows=100 width=757) (actual time=20481.083..30464.960 rows=11 loops=1)
>   ->  Index Scan using cdr_pkey on cdr  (cost=0.00..30036152.32 rows=41212 width=757) (actual time=20481.049..30464.686 rows=11 loops=1)
>         Filter: (lower("CallingPartyNumber") = '9725551212'::text)
> Total runtime: 30465.246 ms

It thinks that it will only need to read 1/412th of the table to
find 100 matching rows, and using that index it will be able to
skip the sort.  Since there aren't 100 matching rows, it has to
read the whole table through the index.  Raising the statistics
target and running ANALYZE might allow it to use a more accurate
estimate, and thereby make a better choice.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Query on a record variable
Next
From: Michael Paquier
Date:
Subject: Re: pg_basebackup from new master's slave then recovery from new master.