Thread: varchar index joins not working?

From: Adam Gundy

I'm hitting an unexpected problem with postgres 8.3 - I have some
tables which use varchar(32) for their unique IDs which I'm attempting
to join using some simple SQL:

select *
from group_access, groups
where group_access.groupid = groups.groupid and
         group_access.uid = '7275359408f44591d0717e16890ce335';

there's a unique index on group_access.groupid, and a non-unique index
on groups.groupid. both are non-null.

the problem is: if groupid (in both tables) is varchar, I cannot force
postgres (no matter how hard I try) to do an index scan. it ends up
reading the entire groups table (pretty large!):

                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8.89..41329.88 rows=119940 width=287) (actual time=0.202..935.136 rows=981 loops=1)
   Hash Cond: ((groups.groupid)::text = (group_access.groupid)::text)
   ->  Seq Scan on groups  (cost=0.00..31696.48 rows=1123348 width=177) (actual time=0.011..446.091 rows=1125239 loops=1)
   ->  Hash  (cost=8.51..8.51 rows=30 width=110) (actual time=0.148..0.148 rows=30 loops=1)
         ->  Seq Scan on group_access  (cost=0.00..8.51 rows=30 width=110) (actual time=0.014..0.126 rows=30 loops=1)
               Filter: ((uid)::text = '7275359408f44591d0717e16890ce335'::text)
 Total runtime: 935.443 ms
(7 rows)

if I disable seq_scan, I get this:
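(I'm turning it off for the session, ie something like:

set enable_seqscan = off;
)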

                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1.47..106189.61 rows=120004 width=287) (actual time=0.100..1532.353 rows=981 loops=1)
   Merge Cond: ((group_access.groupid)::text = (groups.groupid)::text)
   ->  Index Scan using group_access_pkey on group_access  (cost=0.00..43.91 rows=30 width=110) (actual time=0.044..0.148 rows=30 loops=1)
         Index Cond: ((uid)::text = '7275359408f44591d0717e16890ce335'::text)
   ->  Index Scan using groups_1_idx on groups  (cost=0.00..102135.71 rows=1123952 width=177) (actual time=0.031..856.555 rows=1125827 loops=1)
 Total runtime: 1532.880 ms
(6 rows)

it's running an index scan across the entire table (no condition applied) :-(

so, just for the hell of it, I tried making groupid a char(32),
despite repeated assertions in this group that there's no performance
difference between the two:

                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=4.48..253.85 rows=304 width=291) (actual time=0.715..22.906 rows=984 loops=1)
   ->  Bitmap Heap Scan on group_access  (cost=4.48..9.86 rows=30 width=111) (actual time=0.372..0.570 rows=30 loops=1)
         Recheck Cond: (uid = '7275359408f44591d0717e16890ce335'::bpchar)
         ->  Bitmap Index Scan on group_access_uid_key  (cost=0.00..4.48 rows=30 width=0) (actual time=0.331..0.331 rows=30 loops=1)
               Index Cond: (uid = '7275359408f44591d0717e16890ce335'::bpchar)
   ->  Index Scan using groups_1_idx on groups  (cost=0.00..7.96 rows=14 width=180) (actual time=0.176..0.396 rows=33 loops=30)
         Index Cond: (groups.groupid = group_access.groupid)
 Total runtime: 26.837 ms
(8 rows)

(this last plan is actually against a smaller test DB, but I get the
same behavior with it - seq scan for varchar, index scan for char - and
the results returned are identical for this query)

the databases are UTF-8, if that makes a difference...

Re: varchar index joins not working?

From: Richard Huxton

Adam Gundy wrote:
> I'm hitting an unexpected problem with postgres 8.3 - I have some
> tables which use varchar(32) for their unique IDs which I'm attempting
> to join using some simple SQL:
>
> select *
> from group_access, groups
> where group_access.groupid = groups.groupid and
>          group_access.uid = '7275359408f44591d0717e16890ce335';
>
> there's a unique index on group_access.groupid, and a non-unique index
> on groups.groupid. both are non-null.

What about group_access.uid - I'd have thought that + groups pkey is
probably the sensible combination here.

> the problem is: if groupid (in both tables) is varchar, I cannot force
> postgres (no matter how hard I try) to do an index scan. it ends up
> reading the entire groups table (pretty large!):

OK

>                                                         QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=8.89..41329.88 rows=119940 width=287) (actual
> time=0.202..935.136 rows=981 loops=1)

That's because it's expecting 119,940 rows to match (rather than the
actual 981 you do get). If you were getting that many results this is
probably a sensible plan.

>    Hash Cond: ((groups.groupid)::text = (group_access.groupid)::text)
>    ->  Seq Scan on groups  (cost=0.00..31696.48 rows=1123348
> width=177) (actual time=0.011..446.091 rows=1125239 loops=1)

It's got a good idea of the total number of rows in groups.

>    ->  Hash  (cost=8.51..8.51 rows=30 width=110) (actual
> time=0.148..0.148 rows=30 loops=1)
>          ->  Seq Scan on group_access  (cost=0.00..8.51 rows=30
> width=110) (actual time=0.014..0.126 rows=30 loops=1)

And also group_access. Oh, the seq-scan doesn't really matter here. It
probably *is* faster to read all 30 rows in one burst rather than go to
the index and then back to the table.

>                Filter: ((uid)::text = '7275359408f44591d0717e16890ce335'::text)
>  Total runtime: 935.443 ms
> (7 rows)
>
> if I disable seq_scan, I get this:
>
>                                                                    QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=1.47..106189.61 rows=120004 width=287) (actual
> time=0.100..1532.353 rows=981 loops=1)

It's still thinking it's going to get 120 thousand rows.

> it's running an index scan across the entire table (no condition applied) :-(
>
> so, just for the hell of it, I tried making groupid a char(32),
> despite repeated assertions in this group that there's no performance
> difference between the two:

There's no performance difference between the two.

>  Nested Loop  (cost=4.48..253.85 rows=304 width=291) (actual
> time=0.715..22.906 rows=984 loops=1)

> (this last plan is actually against a smaller test DB, but I get the
> same behavior with it, seq scan for varchar or index scan for char,
> and the results returned are identical for this query)

The char(32) thing isn't important here, what is important is that it's
expecting ~300 rows rather than 120,000. It's still wrong, but it's
close enough to make sense.

So - the question is - why is PG expecting so many matches to your join?
How many distinct values do you have in groups.groupid and
group_access.groupid?
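Something along these lines should tell us (assuming the column names above):

SELECT count(*), count(DISTINCT groupid) FROM groups;
SELECT count(*), count(DISTINCT groupid) FROM group_access;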

--
   Richard Huxton
   Archonet Ltd

Re: varchar index joins not working?

From: Adam Gundy

Richard Huxton wrote:
> Adam Gundy wrote:
>> I'm hitting an unexpected problem with postgres 8.3 - I have some
>> tables which use varchar(32) for their unique IDs which I'm attempting
>> to join using some simple SQL:
>>
>> select *
>> from group_access, groups
>> where group_access.groupid = groups.groupid and
>>          group_access.uid = '7275359408f44591d0717e16890ce335';
>>
>> there's a unique index on group_access.groupid, and a non-unique index
>> on groups.groupid. both are non-null.
>
> What about group_access.uid - I'd have thought that + groups pkey is
> probably the sensible combination here.

there is an index covering that on group_access:

"group_access_pkey" PRIMARY KEY, btree (groupid, uid)

adding the (uid, groupid) index helps the small database - it will do an
index join if forced to - but the full database still refuses to do an
index join: it does a full index scan followed by a merge.
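(for reference, the extra index is just something along these lines - name made up:

create index group_access_uid_groupid_idx on group_access (uid, groupid);
)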

>>                                                         QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------
>>  Hash Join  (cost=8.89..41329.88 rows=119940 width=287) (actual
>> time=0.202..935.136 rows=981 loops=1)
>
> That's because it's expecting 119,940 rows to match (rather than the
> actual 981 you do get). If you were getting that many results this is
> probably a sensible plan.

sure. but its estimate is *wildly* off

>>    Hash Cond: ((groups.groupid)::text = (group_access.groupid)::text)
>>    ->  Seq Scan on groups  (cost=0.00..31696.48 rows=1123348
>> width=177) (actual time=0.011..446.091 rows=1125239 loops=1)
>
> It's got a good idea of the total number of rows in groups.

yeah.

>>    ->  Hash  (cost=8.51..8.51 rows=30 width=110) (actual
>> time=0.148..0.148 rows=30 loops=1)
>>          ->  Seq Scan on group_access  (cost=0.00..8.51 rows=30
>> width=110) (actual time=0.014..0.126 rows=30 loops=1)
>
> And also group_access. Oh, the seq-scan doesn't really matter here. It
> probably *is* faster to read all 30 rows in one burst rather than go to
> the index and then back to the table.

agreed.

>> it's running an index scan across the entire table (no condition
>> applied) :-(
>>
>> so, just for the hell of it, I tried making groupid a char(32),
>> despite repeated assertions in this group that there's no performance
>> difference between the two:
>
> There's no performance difference between the two.

hah. if it makes the join with char (and runs fast), or reads the whole
table with varchar, then there *is* a performance difference - a big one!

> The char(32) thing isn't important here, what is important is that it's
> expecting ~300 rows rather than 120,000. It's still wrong, but it's
> close enough to make sense.

> So - the question is - why is PG expecting so many matches to your join.

more to the point, why does it get the estimate right (or close) with
char, but massively wrong with varchar? I've been vacuum analyzing after
each change..
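(specifically, something like:

vacuum analyze groups;
vacuum analyze group_access;

after each type change)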

with the smaller database, and char type, it (for certain joins) still
wants to do a seqscan because the tables are small enough, but if I
disable seqscan, it does an index join (usually with a small time
penalty). if I switch the types back to varchar, re-analyze, re-run, it
*will not* do an index join!

> How many distinct values do you have in groups.groupid and
> group_access.groupid?

for the small database (since it shows the same problem):

group_access: 280/268
groups: 2006/139

for the large database:

group_access: same
groups: 1712647/140

the groupid key is an MD5 hash, so it should be uniformly distributed.
maybe that throws the stats? but, again, char works, varchar doesn't :-(



Re: varchar index joins not working?

From: Adam Gundy

On Thu, Apr 10, 2008 at 8:52 AM, Adam Gundy <adam@starsilk.net> wrote:
> Richard Huxton wrote:
>> How many distinct values do you have in groups.groupid and
>> group_access.groupid?
>
> for the small database (since it shows the same problem):
>
> group_access: 280/268
> groups: 2006/139
>
> for the large database:
>
> group_access: same
> groups: 1712647/140
>
> the groupid key is an MD5 hash, so it should be uniformly distributed.
> maybe that throws the stats? but, again, char works, varchar doesn't :-(

OK, I'm thinking the varchar/char part is not the issue.

the database is very unbalanced: most of the groups have 1000 records or
less, with one group accounting for 95% of the records.

I *think* that when I analyze using char instead of varchar, it is
recording a stat for the large group, but for some reason with varchar
it doesn't add a stat for that one.

so, the real question is, how do I fix this? I can turn the stats way
up to 1000, but that doesn't guarantee that I'll get a stat for the
large group :-(
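(by turning the stats up I mean something like this, if I have the syntax right:

alter table groups alter column groupid set statistics 1000;
analyze groups;
)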

can I turn the statistics off completely for this column? I'm guessing
that if I can, it will take a guess based on the number of distinct
values in the groups table, which is still a large number of records -
possibly enough to trigger the seqscan anyway.

does postgres have a way of building a 'counted index' that the
planner can use for its record counts? some way of forcibly
maintaining a stat for every group?

the groups are not related to one another - is it possible to
partition them into their own indexes somehow?

ahh. lots of questions, no (obvious to me) answers from googling around.

Re: varchar index joins not working?

From: Richard Huxton

Adam Gundy wrote:
> On Thu, Apr 10, 2008 at 8:52 AM, Adam Gundy <adam@starsilk.net> wrote:
>> Richard Huxton wrote:
>>> How many distinct values do you have in groups.groupid and
>>> group_access.groupid?
>>  for the small database (since it shows the same problem):
>>
>>  group_access: 280/268
>>  groups: 2006/139
>>
>>  for the large database:
>>
>>  group_access: same
>>  groups: 1712647/140
>>
>>  the groupid key is an MD5 hash, so it should be uniformly distributed.
>> maybe that throws the stats? but, again, char works, varchar doesn't :-(
>
> OK, I'm thinking the varchar/char part is not the issue.

Good, because it's not :-)

> the database is very unbalanced, most of the groups are 1000 or less
> records, with one group occupying 95% of the records.

I was wondering - that's why I asked for the stats.

> I *think* that when I analyze using char instead of varchar, it is
> recording a stat for the large group, but for some reason with varchar
> doesn't add a stat for that one.
>
> so, the real question is, how do I fix this? I can turn the stats way
> up to 1000, but that doesn't guarantee that I'll get a stat for the
> large group :-(

Well, by default it will be tracking the 10 most common values (and how
often they occur). As you say, this can be increased to 1000 (although
it obviously takes longer to check 1000 rather than 10).

We can have a look at the stats with something like:
SELECT * FROM pg_stats WHERE tablename='group_access' AND attname='uid';
You'll be interested in n_distinct, most_common_vals and most_common_freqs.

However, I think the problem may be that PG doesn't track cross-column
stats, so it doesn't know that a particular uid implies one or more
particular groupid values.

> can I turn the statistics off completely for this column? I'm guessing
> that if I can, that will mean it takes a guess based on the number of
> distinct values in the groups table, which is still large number of
> records, possibly enough to trigger the seqscan anyway.

No - can't disable stats. Besides, you want it the other way around -
index scans for all groups except the largest.

> does postgres have a way of building a 'counted index' that the
> planner can use for it's record counts? some way of forcibly
> maintaining a stat for every group?

No, but let's see what's in pg_stats.

> the groups are not related to one another - is it possible to
> partition them into their own indexes somehow?

Yes, but it will depend on having an explicit groupid=... clause in the
query as well as on the index. That's not going to help you here.
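To be concrete, that would mean partial indexes, something like this
(placeholder value, obviously):

CREATE INDEX groups_biggroup_idx ON groups (groupid)
  WHERE groupid = '<the-big-group-id>';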

--
   Richard Huxton
   Archonet Ltd

Re: varchar index joins not working?

From: Adam Gundy

Richard Huxton wrote:
> Adam Gundy wrote:
>> On Thu, Apr 10, 2008 at 8:52 AM, Adam Gundy <adam@starsilk.net> wrote:
>>> Richard Huxton wrote:
>>>> How many distinct values do you have in groups.groupid and
>>>> group_access.groupid?
>>>  for the small database (since it shows the same problem):
>>>
>>>  group_access: 280/268
>>>  groups: 2006/139
>>>
>>>  for the large database:
>>>
>>>  group_access: same
>>>  groups: 1712647/140
>>>
>>>  the groupid key is an MD5 hash, so it should be uniformly distributed.
>>> maybe that throws the stats? but, again, char works, varchar doesn't :-(
>>
>> OK, I'm thinking the varchar/char part is not the issue.
>
> Good, because it's not :-)

hmm. unfortunately it did turn out to be (part of) the issue. I've
discovered that mixing char and varchar in a stored procedure does not
coerce the types, and ends up doing seq scans all the time.

changing something like this:

proc x ( y char(32) )
{
    select * from groups where groupid = y
}

into this:

proc x ( y varchar(32) )
{
    select * from groups where groupid = y
}

and suddenly postgres does index lookups in the stored proc... way faster.
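(to be concrete, the fixed version looks roughly like this in plpgsql -
simplified, obviously:

create or replace function x(y varchar(32)) returns setof groups as $$
begin
    return query select * from groups where groupid = y;
end;
$$ language plpgsql;
)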

>> I *think* that when I analyze using char instead of varchar, it is
>> recording a stat for the large group, but for some reason with varchar
>> doesn't add a stat for that one.
>>
>> so, the real question is, how do I fix this? I can turn the stats way
>> up to 1000, but that doesn't guarantee that I'll get a stat for the
>> large group :-(
>
> Well, by default it will be tracking the 10 most common values (and how
> often they occur). As you say, this can be increased to 1000 (although
> it obviously takes longer to check 1000 rather than 10).
>
> We can have a look at the stats with something like:
> SELECT * FROM pg_stats WHERE tablename='group_access' AND attname='uid';
> You'll be interested in n_distinct, most_common_vals and most_common_freqs.
>
> However, I think the problem may be that PG doesn't track cross-column
> stats, so it doesn't know that a particular uid implies one or more
> particular groupid values.

I doubt we could get stats stable enough for this. the number of groups
will hopefully be much larger at some point.

it's a shame the index entries can't be used to provide information to
the planner, eg a rough count of the number of entries for a given key
(or subset). it would be nice to be able to create eg a counted btree
when you know you have this kind of data as a hint to the planner.

>> can I turn the statistics off completely for this column? I'm guessing
>> that if I can, that will mean it takes a guess based on the number of
>> distinct values in the groups table, which is still large number of
>> records, possibly enough to trigger the seqscan anyway.
>
> No - can't disable stats. Besides, you want it the other way around -
> index scans for all groups except the largest.

actually, disabling seqscan at the server level gives extremely good
response times. I ended up rewriting a few queries that were scanning
the whole group for no good reason, and bitmap index hashing seems to
take care of things nicely.
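(by 'at the server level' I just mean this in postgresql.conf:

enable_seqscan = off
)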

queries have gone from 30+ seconds to < 0.1 seconds.

>> does postgres have a way of building a 'counted index' that the
>> planner can use for it's record counts? some way of forcibly
>> maintaining a stat for every group?
>
> No, but let's see what's in pg_stats.

no real help there. either it hits the group being read, and does a good
plan, or it doesn't, and tries to seqscan (unless I disable it). even
forcing stats to 1000 only bandaids the situation, given the number of
groups will eventually exceed that..



Re: varchar index joins not working?

From: Tom Lane

Adam Gundy <adam@starsilk.net> writes:
> hmm. unfortunately it did turn out to be (part of) the issue. I've
> discovered that mixing char and varchar in a stored procedure does not
> coerce the types, and ends up doing seq scans all the time.

Oh, it coerces the type all right, just not in the direction you'd like.

regression=# create table v (f1 varchar(32) primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "v_pkey" for table "v"
CREATE TABLE
regression=# explain select * from v where f1 = 'abc'::varchar;
                           QUERY PLAN
-----------------------------------------------------------------
 Index Scan using v_pkey on v  (cost=0.00..8.27 rows=1 width=34)
   Index Cond: ((f1)::text = 'abc'::text)
(2 rows)

regression=# explain select * from v where f1 = 'abc'::char(3);
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on v  (cost=0.00..25.88 rows=1 width=34)
   Filter: ((f1)::bpchar = 'abc'::character(3))
(2 rows)

            regards, tom lane

Re: varchar index joins not working?

From: Richard Huxton

Adam Gundy wrote:
> I doubt we could get stats stable enough for this. the number of groups
> will hopefully be much larger at some point.

The pg_stats table should be recording the n most-common values, so if
you have 1 million groups you track details of the 1000 most-common.
That gives you a maximum for how common any value not in the stats can be.
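For the join estimate it's groups.groupid that matters, so something like:

SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'groups' AND attname = 'groupid';

should show whether that one huge group made it into the list.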

>> No, but let's see what's in pg_stats.
>
> no real help there. either it hits the group being read, and does a good
> plan, or it doesn't, and tries to seqscan (unless I disable it). even
> forcing stats to 1000 only bandaids the situation, given the number of
> groups will eventually exceed that..

Like I say, that's not the point of gathering the stats. If one group
represents 95% of your rows, then its group-id should be almost certain
to occur in the stats. Are you saying that's not happening with your data?

--
   Richard Huxton
   Archonet Ltd

Re: varchar index joins not working?

From: Adam Gundy

Tom Lane wrote:
> Adam Gundy <adam@starsilk.net> writes:
>> hmm. unfortunately it did turn out to be (part of) the issue. I've
>> discovered that mixing char and varchar in a stored procedure does not
>> coerce the types, and ends up doing seq scans all the time.
>
> Oh, it coerces the type all right, just not in the direction you'd like.
>
> regression=# create table v (f1 varchar(32) primary key);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "v_pkey" for table "v"
> CREATE TABLE
> regression=# explain select * from v where f1 = 'abc'::varchar;
>                            QUERY PLAN
> -----------------------------------------------------------------
>  Index Scan using v_pkey on v  (cost=0.00..8.27 rows=1 width=34)
>    Index Cond: ((f1)::text = 'abc'::text)
> (2 rows)
>
> regression=# explain select * from v where f1 = 'abc'::char(3);
>                     QUERY PLAN
> ---------------------------------------------------
>  Seq Scan on v  (cost=0.00..25.88 rows=1 width=34)
>    Filter: ((f1)::bpchar = 'abc'::character(3))
> (2 rows)

yeah. not terribly helpful.. you'd have to assume I'm not the only one
this has bitten..

is there a reason it doesn't coerce to a type that's useful to the
planner (ie varchar in my case), or that the planner doesn't accept any
type of string as a valid match for an index scan? I would think the
benefits of being able to index scan always outweigh the cost of type
conversion...


hmm. I only saw this with stored procs, but it's obviously generic. I
think the reason I didn't see it with straight SQL or views is that it
seems to work correctly with string constants.. coercing them to the
correct type for the index scan. with a stored proc, all the constants
are passed in as args, with char() type (until I fixed it, obviously!)



Re: varchar index joins not working?

From: Tom Lane

Adam Gundy <adam@starsilk.net> writes:
> Tom Lane wrote:
>> Oh, it coerces the type all right, just not in the direction you'd like.

> is there a reason it doesn't coerce to a type that's useful to the
> planner (ie varchar in my case),

In this case I think the choice is probably semantically correct:
shouldn't a comparison of varchar (trailing space sensitive) and
char (trailing space INsensitive) follow trailing-space-insensitive
semantics?
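For instance, just to illustrate the two notions of equality:

select 'abc '::char(4) = 'abc'::char(4);        -- true: trailing spaces ignored
select 'abc '::varchar(4) = 'abc'::varchar(4);  -- false: trailing spaces significant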

I wouldn't swear that the behavior is intentional ;-) as to going
that way rather than the other, but I'm disinclined to change it.

> or the planner doesn't accept any type
> of string as a valid match for index scan?

Can't.  This equality operator doesn't have the same notion of equality
that that index does.

The long and the short of it is that mixing char and varchar is
hazardous.

            regards, tom lane

Re: varchar index joins not working?

From: Adam Gundy

Tom Lane wrote:
> Adam Gundy <adam@starsilk.net> writes:
>> Tom Lane wrote:
>>> Oh, it coerces the type all right, just not in the direction you'd like.
>
>> is there a reason it doesn't coerce to a type that's useful to the
>> planner (ie varchar in my case),
>
> In this case I think the choice is probably semantically correct:
> shouldn't a comparison of varchar (trailing space sensitive) and
> char (trailing space INsensitive) follow trailing-space-insensitive
> semantics?
>
> I wouldn't swear that the behavior is intentional ;-) as to going
> that way rather than the other, but I'm disinclined to change it.

ahh. I forgot about the trailing spaces. but you can always coerce a
char to a varchar safely, which would have fixed my issue. you can't
coerce the other way, as you say, because you'll lose the trailing spaces...


alternatively, can the planner give warnings somehow? or suggestions? eg
some messages from 'explain analyze' like:

   'I could make your query go much faster IF ...'

or

   'suggestion: create an index on ...'
   'suggestion: convert this index to ...'

or

    'warning: I'd really like to use this index, BUT ...'

>> or the planner doesn't accept any type
>> of string as a valid match for index scan?
>
> Can't.  This equality operator doesn't have the same notion of equality
> that that index does.
>
> The long and the short of it is that mixing char and varchar is
> hazardous.

no kidding.

