Re: Deceiding which index to use - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Deceiding which index to use
Date
Msg-id 45F1775A.8030701@archonet.com
Whole thread Raw
In response to Re: Deceiding which index to use  (Mezei Zoltán <mezei.zoltan@telefor.hu>)
Responses Re: Deceiding which index to use
List pgsql-performance
Mezei Zoltán wrote:
> Richard Huxton wrote:
>  > Mezei Zoltán wrote:
>  > Q1. Why are you storing a numeric in a varchar?
>
> Because it's not always numeric info. :/
>
>  > Q2. How many unique values does anumber have? And how many rows in
>  > subscriber?
>
> About 10k distinct anumbers and 20k rows. Nothing special...

And does the planner know that?
SELECT * FROM pg_stats WHERE tablename='subscriber' AND attname='anumber';
It's the n_distinct you're interested in, and perhaps most_common_freqs.

>  > Q3. What happens if you create the index on plain (anumber) and then
>  > test against '555555555'?
>
> Nothing, everything is the same - the problem lies on the other table's index
> usage, using this index is fine.

The planner has to guess how many matches it will have for
subscriber=5555555. Based on that choice, it will either:
   a. Do the join, then find the highest crd values (sort)
   b. Scan the crd values backwards and then join
It's chosen (b) because it's estimating the numbers of matches
incorrectly. I'm wondering whether the system can't see through your
function-call (the cast to numeric) to determine how many matches it's
going to get for any given value.

If the system can't be persuaded into getting its estimates more
accurate, it might be worth trying an index on (subscriber_id,crd) and
dropping the index on (crd) - if that's reasonable for your query patterns.

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: Mezei Zoltán
Date:
Subject: Re: Deceiding which index to use
Next
From: Mezei Zoltán
Date:
Subject: Re: Deceiding which index to use