Thread: Bug? 8.0 does not use partial index
Hi!

Here's an odd thing. I use a partial index on a table, group_data:

CREATE TABLE group_data (
    this_group_id integer NOT NULL,
    group_id integer
    -- ...
);

create index foo on group_data(this_group_id) where group_id is null;

There are approx 1 million tuples where this_group_id=46, but only 4 (four) where group_id is null. So I would expect this query to use the index:

select * from group_data where this_group_id=46 and group_id is null;

On 7.4.5, it uses the index, but on 8.0rc5, it does not:

7.4.5=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40383.21 rows=108786 width=43) (actual time=0.154..0.176 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.241 ms
(4 rows)

8.0.0rc5=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on group_data  (cost=0.00..140180.91 rows=211378 width=45) (actual time=383.689..32991.424 rows=4 loops=1)
   Filter: ((group_id IS NULL) AND (this_group_id = 46))
 Total runtime: 32991.469 ms
(3 rows)
Time: 32992.812 ms

This is bad.
But it gets worse:

8.0.0rc5=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using group_data_tgid_gidnull_idx on group_data  (cost=0.00..145622.85 rows=78985 width=45) (actual time=0.033..0.039 rows=4 loops=1)
   Index Cond: ((this_group_id = 46) AND (this_group_id = 46))
   Filter: (group_id IS NULL)
 Total runtime: 0.086 ms
(4 rows)
Time: 1.912 ms

Don't tell me this is not a bug? Writing "this_group_id = 46 and this_group_id = 46" seems like a pretty odd way to get a query to use an index. If you need more specific info, please mail me!

Regards,
Palle
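For anyone wanting to try this at home, the report above boils down to the following condensed sketch (the real table has more columns, and the ~1M-row data set is of course not included here):

```sql
-- Condensed reproduction of the setup described above (a sketch;
-- the real table has more columns, ~1M rows with this_group_id = 46,
-- and only 4 of those also have group_id IS NULL).
CREATE TABLE group_data (
    this_group_id integer NOT NULL,
    group_id      integer
);

CREATE INDEX foo ON group_data (this_group_id) WHERE group_id IS NULL;

-- The query's WHERE clause implies the index predicate, so the
-- partial index should be usable:
EXPLAIN ANALYZE
SELECT * FROM group_data
WHERE this_group_id = 46 AND group_id IS NULL;
```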
> create index foo on group_data(this_group_id) where group_id is null;

Try this instead:

create index foo on group_data(this_group_id) where nullvalue(group_id);

And:

select * from group_data where this_group_id = 46 and nullvalue(group_id);

... John
Palle Girgensohn <girgen@pingpong.net> writes:
> On 7.4.5, it uses the index, but on 8.0rc5, it does not:

Have you ANALYZEd the 8.0 table lately? Those rowcount estimates look mighty far off.

			regards, tom lane
Yes, they are analyzed and vacuumed. How do you mean they look far off? The data in the two DBs were not identical in the example I sent.

With identical data in both 7.4.5 and 8.0.0rc5 (both freshly pg_restored and vacuum analyzed), 7.4.5 used the index, and on 8.0.0rc5, only when I add the this_group_id=46 *three times* does it decide to use the index. So this might be a special case, but with more data expected into the system shortly, the query takes 30 secs on 8.0.0rc5 and 12 ms on 7.4.5. That's a factor of 2000, which is too much for me :(

If you want, I can send you the data.

orig=# create index foo on group_data(this_group_id) where group_id is null;
CREATE INDEX
Time: 2240.438 ms
orig=# vacuum analyze group_data;
VACUUM
Time: 13222.171 ms
orig=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on group_data  (cost=0.00..47544.43 rows=114164 width=43) (actual time=114.015..1334.479 rows=4 loops=1)
   Filter: ((group_id IS NULL) AND (this_group_id = 46))
 Total runtime: 1334.526 ms
(3 rows)
Time: 1335.794 ms

orig=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46;
                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on group_data  (cost=0.00..52953.91 rows=43230 width=43) (actual time=126.061..1344.729 rows=4 loops=1)
   Filter: ((group_id IS NULL) AND (this_group_id = 46) AND (this_group_id = 46))
 Total runtime: 1344.777 ms
(3 rows)
Time: 1345.684 ms

orig=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46 and this_group_id = 46 and this_group_id = 46;
                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..11791.58 rows=6199 width=43) (actual time=12.455..12.465 rows=4 loops=1)
   Index Cond: ((this_group_id = 46) AND (this_group_id = 46) AND (this_group_id = 46) AND (this_group_id = 46))
   Filter: (group_id IS NULL)
 Total runtime: 12.519 ms
(4 rows)
Time: 13.932 ms

orig=# select count(this_group_id) from group_data where this_group_id=46;
 count
--------
 797426
(1 row)
Time: 1843.869 ms

orig=# select count(this_group_id) from group_data where this_group_id=46 and group_id is null;
 count
-------
     4
(1 row)
Time: 1647.350 ms

====================================================

Same thing on 7.4.5:

kth=# create index foo on group_data(this_group_id) where group_id is null;
CREATE INDEX
kth=# vacuum analyze group_data;
VACUUM
kth=# \timing
Timing is on.
kth=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40408.72 rows=109317 width=43) (actual time=0.154..0.175 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.241 ms
(4 rows)
Time: 2,785 ms

kth=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40408.72 rows=109317 width=43) (actual time=0.033..0.054 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.121 ms
(4 rows)
Time: 1,607 ms

kth=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46 and this_group_id = 46 and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40408.72 rows=109317 width=43) (actual time=0.033..0.055 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.119 ms
(4 rows)
Time: 1,702 ms

kth=# select count(this_group_id) from group_data where this_group_id=46;
 count
--------
 797426
(1 row)
Time: 1821,433 ms

kth=# select count(this_group_id) from group_data where this_group_id=46 and group_id is null;
 count
-------
     4
(1 row)
Time: 1,635 ms

/Palle

--On torsdag, januari 13, 2005 16.33.58 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> On 7.4.5, it uses the index, but on 8.0rc5, it does not:
>
> Have you ANALYZEd the 8.0 table lately? Those rowcount estimates look
> mighty far off.
>
> 			regards, tom lane
Thanks, but the behaviour seems identical. :(

/Palle

--On fredag, januari 14, 2005 07.37.46 +1100 John Hansen <john@geeknet.com.au> wrote:

>> create index foo on group_data(this_group_id) where group_id is null;
>
> Try this instead:
>
> create index foo on group_data(this_group_id) where nullvalue(group_id);
>
> And:
>
> select * from group_data where this_group_id = 46 and nullvalue(group_id);
>
> ... John
Palle Girgensohn <girgen@pingpong.net> writes:
> How do you mean they look far off?

> Seq Scan on group_data  (cost=0.00..47544.43 rows=114164 width=43) (actual
> time=114.015..1334.479 rows=4 loops=1)

114164 estimated vs 4 actual rows is pretty far off. Perhaps something skewed about the data distribution?

> If you want, I can send you the data.

If it's not too huge, sure, send it to me off-list.

			regards, tom lane
> Thanks, but the behaviour seems identical. :(

Odd, though, that I was never able to get null values indexed (the index was never used) unless I used this approach...

Hmmm, on the other hand... just realised you're not actually indexing null values; here, IS NULL is the qualifier for the partial index...

As Tom suggested, try vacuum analyze on that table...

... John
> If you want, I can send you the data.

If you can make a dump of the schema and data available for download somewhere, I won't mind having a go at it...

... John
--On torsdag, januari 13, 2005 17.03.41 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> How do you mean they look far off?
>
>> Seq Scan on group_data  (cost=0.00..47544.43 rows=114164 width=43)
>> (actual time=114.015..1334.479 rows=4 loops=1)
>
> 114164 estimated vs 4 actual rows is pretty far off. Perhaps something
> skewed about the data distribution?

Well, it might seem strange, but it is a quite normal data distribution for this application, believe me.

>> If you want, I can send you the data.
>
> If it's not too huge, sure, send it to me off-list.

I'm doing that now.

/Palle
Palle Girgensohn <girgen@pingpong.net> writes:
> --On torsdag, januari 13, 2005 18.18.37 -0500 Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>> So there's something nuts about the statistics in this case.

On looking into it, it's the same old issue of not having column correlation statistics. pg_stats shows that ANALYZE estimated the fraction of rows with null group_id as 0.137667 (versus the exact value of 0.147, not too bad considering I used the default statistics target), and it estimated the fraction with this_group_id = 46 as 0.358 (vs actual 0.369, ditto). The problem is that it then estimates the total selectivity as 0.137667 * 0.358, or a bit under 5%, much too high to make an indexscan sensible. In reality there are only 4 rows with this combination of values, but the planner has no way to know that.

> Anything I can do about it?

I thought of a fairly miserable hack, which relies on the fact that 8.0 does know how to accumulate statistics on functional indexes:

group=# create index fooi on group_data (abs(this_group_id)) WHERE group_id IS NULL;
CREATE INDEX
group=# analyze group_data;
ANALYZE
group=# explain select * from group_data where group_id is null and abs(this_group_id) = 46;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using fooi on group_data  (cost=0.00..5302.60 rows=1802 width=42)
   Index Cond: (abs(this_group_id) = 46)
   Filter: (group_id IS NULL)
(3 rows)

(The choice of abs() is arbitrary; it just has to be something other than the unadorned column.) In this situation the planner will look at the stats for the functional index and discover that in that index there aren't many 46's, so it comes out with a more reasonable rowcount estimate.

We should probably make it accumulate stats on partial indexes even when the index columns aren't expressions. This example shows that useful stats can be derived that way. Too late for 8.0 though...

			regards, tom lane
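Tom's "a bit under 5%" falls straight out of multiplying the two per-column fractions, which you can inspect yourself in pg_stats. A sketch (table and column names are the ones from this thread; the arithmetic in the comments uses the numbers Tom quotes above):

```sql
-- Where the bad rows estimate comes from (a sketch using the numbers
-- quoted above):
SELECT null_frac        -- ~0.137667: estimated fraction with group_id IS NULL
FROM pg_stats
WHERE tablename = 'group_data' AND attname = 'group_id';

-- With no cross-column statistics, the planner assumes the two
-- conditions are independent and multiplies their selectivities:
--     0.137667 * 0.358 ~= 0.0493   (a bit under 5%)
-- Applied to the table's total row count, that yields the rows=114164
-- estimate seen in the seqscan plan, while the true joint count is 4.
```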
I wrote:
> I thought of a fairly miserable hack, which relies on the fact that 8.0
> does know how to accumulate statistics on functional indexes:

Never mind, it turns out that doesn't work the way I thought. It's actually falling back to a default estimate :-(. I still think it'd be a good idea to use stats on partial indexes in future releases, but right at the moment we aren't doing any such thing.

Here's an even more miserable hack: use a non-partial functional index over a multicolumn expression as a poor man's way of creating cross-column stats. For example, assuming all this_group_id values are positive:

group=# create function myfunc(int,int) returns int as
group-# 'SELECT CASE WHEN $2 IS NULL THEN $1 ELSE -$1 END' language sql immutable;
CREATE FUNCTION
group=# create index fooi2 on group_data (myfunc(this_group_id, group_id));
CREATE INDEX
group=# analyze group_data;
ANALYZE
group=# explain analyze select * from group_data where myfunc(this_group_id, group_id)=46;
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using fooi2 on group_data  (cost=0.00..2948.85 rows=792 width=43) (actual time=0.171..0.198 rows=4 loops=1)
   Index Cond: (CASE WHEN (group_id IS NULL) THEN this_group_id ELSE (- this_group_id) END = 46)
 Total runtime: 0.304 ms
(3 rows)

Dunno if you're desperate enough to try that ... but it does seem to work.

			regards, tom lane
--On torsdag, januari 13, 2005 18.55.11 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> --On torsdag, januari 13, 2005 18.18.37 -0500 Tom Lane
>> <tgl@sss.pgh.pa.us> wrote:
>>> So there's something nuts about the statistics in this case.
>
> On looking into it, it's the same old issue of not having column
> correlation statistics. pg_stats shows that ANALYZE estimated the
> fraction of rows with null group_id as 0.137667 (versus exact value
> of 0.147, not too bad considering I used the default statistics target)
> and it estimated the fraction with this_group_id = 46 as 0.358
> (vs actual 0.369, ditto). The problem is that it then estimates the
> total selectivity as 0.137667 * 0.358 or a bit under 5%, much too high
> to make an indexscan sensible. In reality there are only 4 rows with
> this combination of values, but the planner has no way to know that.
>
>> Anything I can do about it?
>
> I thought of a fairly miserable hack, which relies on the fact that 8.0
> does know how to accumulate statistics on functional indexes:
>
> group=# create index fooi on group_data (abs(this_group_id)) WHERE
> group_id IS NULL;
> CREATE INDEX
> group=# analyze group_data;
> ANALYZE
> group=# explain select * from group_data where group_id is null and
> abs(this_group_id) = 46;
>  Index Scan using fooi on group_data  (cost=0.00..5302.60 rows=1802 width=42)
>    Index Cond: (abs(this_group_id) = 46)
>    Filter: (group_id IS NULL)
> (3 rows)
>
> (The choice of abs() is arbitrary, it just has to be something other
> than the unadorned column.) In this situation the planner will look at
> the stats for the functional index and discover that in that index there
> aren't many 46's, so it comes out with a more reasonable rowcount
> estimate.

OK, I think I understand. And this is changed between 7.4.x and 8.0?

> We should probably make it accumulate stats on partial indexes even when
> the index columns aren't expressions. This example shows that useful
> stats can be derived that way. Too late for 8.0 though...

True, but for the next version, perhaps? :)

Trying all this out, I realize that on 7.4.5 I can sometimes get different results after `vacuum analyze' vs. a plain `analyze' (again, not exactly the same data, and I cannot reproduce this on the other machine with the data I sent you). It does not really relate to the question above, but perhaps you can explain why I get different results? I join with a table person; group_data.item_text has person.userid as a foreign key constraint.

7.4.5:

pp=# vacuum analyze group_data;
VACUUM
Time: 256353,802 ms
pp=# select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
 last_name |  userid
-----------+----------
 Lastname  | u1wmd5nn
(1 row)
Time: 6223,123 ms

pp=# explain analyze select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=76016.57..77215.19 rows=92632 width=23) (actual time=7649.496..7649.512 rows=1 loops=1)
   ->  Sort  (cost=76016.57..76416.11 rows=159816 width=23) (actual time=7649.481..7649.487 rows=1 loops=1)
         Sort Key: p.last_name, p.userid
         ->  Hash Join  (cost=3003.90..62203.64 rows=159816 width=23) (actual time=7649.254..7649.435 rows=1 loops=1)
               Hash Cond: ("outer".item_text = "inner".userid)
               ->  Seq Scan on group_data gd  (cost=0.00..53238.10 rows=160565 width=12) (actual time=431.078..5927.410 rows=5 loops=1)
                     Filter: ((this_group_id = 46) AND (group_id IS NULL))
               ->  Hash  (cost=2229.32..2229.32 rows=92632 width=23) (actual time=1555.797..1555.797 rows=0 loops=1)
                     ->  Seq Scan on person p  (cost=0.00..2229.32 rows=92632 width=23) (actual time=0.093..856.728 rows=92632 loops=1)
 Total runtime: 7652.771 ms
(10 rows)
Time: 7656,909 ms

pp=# select * from group_data where this_group_id=46 and group_id is null;
 this_group_id | group_id | item_text | item_int | link_path
---------------+----------+-----------+----------+-----------
            46 |          |           |     1223 | :46:
            46 |          |           |     1228 | :46:
            46 |          |           |     1328 | :46:
            46 |          |           |     1391 | :46:
            46 |          | u1wmd5nn  |          | :46:
(5 rows)
Time: 5891,716 ms

pp=# analyze group_data;
ANALYZE
Time: 3210,096 ms
pp=# explain select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=111417.47..113761.30 rows=92632 width=23)
   ->  Sort  (cost=111417.47..112198.75 rows=312510 width=23)
         Sort Key: p.last_name, p.userid
         ->  Hash Join  (cost=3003.90..79231.40 rows=312510 width=23)
               Hash Cond: ("outer".item_text = "inner".userid)
               ->  Index Scan using group_data_tgid_gidnull_idx on group_data gd  (cost=0.00..65091.35 rows=275225 width=11)
                     Index Cond: (this_group_id = 46)
                     Filter: (group_id IS NULL)
               ->  Hash  (cost=2229.32..2229.32 rows=92632 width=23)
                     ->  Seq Scan on person p  (cost=0.00..2229.32 rows=92632 width=23)
(10 rows)
Time: 6,647 ms

pp=# select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
 last_name |  userid
-----------+----------
 Lastname  | u1wmd5nn
(1 row)
Time: 772,969 ms

pp=# select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   person p, group_data gd
pp-# where
pp-#   p.userid = gd.item_text
pp-#   and gd.this_group_id = 46
pp-#   and gd.group_id is null
pp-# ;
 last_name |  userid
-----------+----------
 Lastname  | u1wmd5nn
(1 row)
Time: 720,345 ms

/Palle
Palle Girgensohn <girgen@pingpong.net> writes:
> Trying all this out, I realize that on 7.4.5, I can sometimes get different
> results after `vacuum analyze' vs. a plain `analyze' (again, not exactly
> the same data, and I cannot reproduce this on the other machine with the
> data I sent you). It does not really relate to the question above, but
> perhaps you can explain how come I get different results?

No surprise. vacuum analyze produces an exact total row count, whereas analyze can only produce an approximate total row count (since it only samples the table rather than groveling over every row). Sometimes the approximate count will be far enough off to affect the estimates.

			regards, tom lane
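The total row count Tom refers to is stored in pg_class.reltuples. A quick way to see how far a sample-based estimate has drifted from reality is to compare it with an exact count (a sketch, not from the thread; group_data is the table discussed above):

```sql
-- Compare the planner's stored row count with the exact one.
-- After VACUUM ANALYZE the two should agree almost exactly;
-- after a plain ANALYZE, reltuples is only a sample-based estimate
-- and may be noticeably off on a skewed or recently-churned table.
SELECT reltuples FROM pg_class WHERE relname = 'group_data';

SELECT count(*) FROM group_data;
```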
--On torsdag, januari 13, 2005 19.44.57 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> Trying all this out, I realize that on 7.4.5, I can sometimes get
>> different results after `vacuum analyze' vs. a plain `analyze' (again,
>> not exactly the same data, and I cannot reproduce this on the other
>> machine with the data I sent you). It does not really relate to the
>> question above, but perhaps you can explain how come I get different
>> results?
>
> No surprise. vacuum analyze produces an exact total row count, whereas
> analyze can only produce an approximate total row count (since it only
> samples the table rather than groveling over every row). Sometimes the
> approximate count will be far enough off to affect the estimates.

Reasonable. Thanks for clarifying that. In the normal case, vacuum analyze is better, I guess?

/Palle
--On torsdag, januari 13, 2005 19.32.38 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
>> I thought of a fairly miserable hack, which relies on the fact that 8.0
>> does know how to accumulate statistics on functional indexes:
>
> Never mind, it turns out that doesn't work the way I thought. It's
> actually falling back to a default estimate :-(. I still think it'd
> be a good idea to use stats on partial indexes in future releases,
> but right at the moment we aren't doing any such thing.

Oh, uh, too bad :(

> Here's an even more miserable hack: use a non-partial functional index
> over a multicolumn expression as a poor man's way of creating
> cross-column stats. For example, assuming all this_group_id values are
> positive:

[snip]

> Dunno if you're desperate enough to try that ... but it does seem to work.

Interesting. Yes, I might be that desperate, actually. As desperate as 30 s vs 30 ms can get me... :)

And this one would actually accumulate stats alright? Looks kinda hairy to me. It's not really a partial index anymore, but I guess that doesn't matter...

How is the behaviour on 7.4 for this stuff? It seems 7.4 does use the partial index, but not always, as shown in a previous mail. Will this example work on both versions, or just for 8.0?

BTW, shall I file some kind of bug report / feature request about accumulating stats for partial indices etc.? Perhaps you're best equipped to file such a report ;-)

Regards,
Palle
>> Dunno if you're desperate enough to try that ... but it does seem to work.

If you're going to hack anyway, then why not just simply tell the planner that you know better, and that it should use the index regardless of the stats collected?

set enable_seqscan=false;
<your original query here>;
set enable_seqscan=true;

... John
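If you go this route, a slightly safer variant is to scope the override to a single transaction with SET LOCAL, so a failure partway through can't leave seqscans disabled for the rest of the session (a sketch; the query is the one from earlier in the thread):

```sql
-- Scope the planner override to one transaction: SET LOCAL reverts
-- automatically at COMMIT or ROLLBACK, so an error in the query
-- cannot leave enable_seqscan off for subsequent statements.
BEGIN;
SET LOCAL enable_seqscan = off;
SELECT * FROM group_data
WHERE this_group_id = 46 AND group_id IS NULL;
COMMIT;
```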
--On fredag, januari 14, 2005 11.52.38 +1100 John Hansen <john@geeknet.com.au> wrote:

>>> Dunno if you're desperate enough to try that ... but it does seem to
>>> work.
>
> If you're going to hack anyway, then why not just simply tell the planner
> that you know better, and that it should use the index regardless of the
> stats collected?
>
> set enable_seqscan=false;
> <your original query here>;
> set enable_seqscan=true;

Yeah, maybe I'll do that, but I stumbled on at least one more complicated query that lost performance from disabling seqscans: it had a union with a query that gains performance from disabling seqscans... heh...

Thanks for your time, guys! I now have some options to move on. I would love to see this working better in a future version of PostgreSQL, though ;-)

Regards,
Palle
Palle Girgensohn <girgen@pingpong.net> writes:
> Interesting. Yes, I might be that desperate, actually. As desperate as
> 30 s vs 30 ms can get me... :)

> And this one would actually accumulate stats alright? Looks kinda hairy
> to me. It's not really a partial index anymore, but I guess that doesn't
> matter...

Yeah, it would. The trick is finding a mapping function that will map all the cases you care about to distinct values.

> How is the behaviour on 7.4 for this stuff? It seems 7.4 does use the
> partial index, but not always, as shown in a previous mail. Will this
> example work on both versions, or just for 8.0?

7.4 doesn't keep stats on functional indexes, so the hack is a nonstarter there. I am not sure why you are seeing different results on 7.4 than 8.0 for the original example --- 7.4 is certainly not smarter than 8.0, and we seem to have ruled out the idea that some sort of glitch is confusing the 8.0 planner.

			regards, tom lane
John Hansen <john@geeknet.com.au> writes:
>> Thanks, but the behaviour seems identical. :(
>
> odd tho, that I was never able to get null values indexed (index was
> never used) unless I used this approach....

You're mixing up the indexed column with the WHERE clause of a partial index. They behave differently.

Null values *are* normally indexed in Postgres. The problem is that the optimizer doesn't recognize IS NULL as an indexable operation, so they don't always help unless you do something like you describe above.

But the WHERE clause on partial indexes is another story. The optimizer recognizes IS NULL as being equivalent to IS NULL, so it recognizes that the partial index is usable just fine.

-- greg
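The distinction above can be made concrete by putting the two index shapes side by side (a sketch based on the schema in this thread; the behaviour described is that of the 7.4/8.0-era optimizer):

```sql
-- Partial index: IS NULL appears only in the index *predicate*.
-- The planner proves that the query's WHERE clause implies the
-- predicate, so this index is a candidate for the 4-row case:
CREATE INDEX foo ON group_data (this_group_id) WHERE group_id IS NULL;

SELECT * FROM group_data
WHERE this_group_id = 46 AND group_id IS NULL;  -- predicate implied

-- Ordinary index on group_id: the NULLs are physically stored in the
-- index, but in these versions "group_id IS NULL" is not recognized
-- as an indexable condition, so the index cannot be scanned for it:
CREATE INDEX bar ON group_data (group_id);

SELECT * FROM group_data WHERE group_id IS NULL;  -- bar not usable here
```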
Palle Girgensohn <girgen@pingpong.net> writes:
>> No surprise. vacuum analyze produces an exact total row count, whereas
>> analyze can only produce an approximate total row count (since it only
>> samples the table rather than groveling over every row). Sometimes the
>> approximate count will be far enough off to affect the estimates.

> Reasonable. Thanks for clarifying that. In the normal case, vacuum analyze
> is better, I guess?

If you intend to do both steps, the combined command is definitely better than issuing them separately. I wouldn't say that you need to do the combined command in situations where you'd otherwise do just one.

			regards, tom lane