Thread: Bug? 8.0 does not use partial index
Hi!

Here's an odd thing. I use a partial index on a table, group_data:

CREATE TABLE group_data (
    this_group_id integer NOT NULL,
    group_id integer
    -- ...
);

create index foo on group_data(this_group_id) where group_id is null;

There are approx 1 million tuples where this_group_id=46, but only 4 (four) where group_id is null. So I would expect this query to use the index:

select * from group_data where this_group_id=46 and group_id is null;

On 7.4.5, it uses the index, but on 8.0rc5, it does not:

7.4.5=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40383.21 rows=108786 width=43) (actual time=0.154..0.176 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.241 ms
(4 rows)

8.0.0rc5=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on group_data  (cost=0.00..140180.91 rows=211378 width=45) (actual time=383.689..32991.424 rows=4 loops=1)
   Filter: ((group_id IS NULL) AND (this_group_id = 46))
 Total runtime: 32991.469 ms
(3 rows)
Time: 32992.812 ms

This is bad.
But it gets worse:

8.0.0rc5=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using group_data_tgid_gidnull_idx on group_data  (cost=0.00..145622.85 rows=78985 width=45) (actual time=0.033..0.039 rows=4 loops=1)
   Index Cond: ((this_group_id = 46) AND (this_group_id = 46))
   Filter: (group_id IS NULL)
 Total runtime: 0.086 ms
(4 rows)
Time: 1.912 ms

Don't tell me this is not a bug? Writing "this_group_id = 46 and this_group_id = 46" seems like a pretty odd way to get a query to use an index. If you need more specific info, please mail me!

Regards,
Palle
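For anyone wanting to try this at home, the report above boils down to the following condensed sketch (the real table has more columns, and the ~1M-row data set is of course not included here):

```sql
-- Condensed reproduction of the setup described above (a sketch;
-- the real table has more columns, ~1M rows with this_group_id = 46,
-- and only 4 of those also have group_id IS NULL).
CREATE TABLE group_data (
    this_group_id integer NOT NULL,
    group_id      integer
);

CREATE INDEX foo ON group_data (this_group_id) WHERE group_id IS NULL;

-- The query's WHERE clause implies the index predicate, so the
-- partial index should be usable:
EXPLAIN ANALYZE
SELECT * FROM group_data
WHERE this_group_id = 46 AND group_id IS NULL;
```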
> create index foo on group_data(this_group_id) where group_id is null;

Try this instead:

create index foo on group_data(this_group_id) where nullvalue(group_id);

And:

select * from group_data where this_group_id = 46 and nullvalue(group_id);

... John
Palle Girgensohn <girgen@pingpong.net> writes:
> On 7.4.5, it uses the index, but on 8.0rc5, it does not:

Have you ANALYZEd the 8.0 table lately? Those rowcount estimates look mighty far off.

			regards, tom lane
Yes, they are analyzed and vacuumed. How do you mean they look far off? The data in the two DBs were not identical in the example I sent.

With identical data in both 7.4.5 and 8.0.0rc5 (both freshly pg_restored and vacuum analyzed), 7.4.5 used the index, and on 8.0.0rc5, only when I add the this_group_id=46 *three times* does it decide to use the index. So this might be a special case, but with more data expected into the system shortly, the query takes 30 secs on 8.0.0rc5 and 12 ms on 7.4.5. That's a factor of 2000, which is too much for me :(

If you want, I can send you the data.

orig=# create index foo on group_data(this_group_id) where group_id is null;
CREATE INDEX
Time: 2240.438 ms
orig=# vacuum analyze group_data;
VACUUM
Time: 13222.171 ms
orig=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on group_data  (cost=0.00..47544.43 rows=114164 width=43) (actual time=114.015..1334.479 rows=4 loops=1)
   Filter: ((group_id IS NULL) AND (this_group_id = 46))
 Total runtime: 1334.526 ms
(3 rows)
Time: 1335.794 ms

orig=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46;
                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on group_data  (cost=0.00..52953.91 rows=43230 width=43) (actual time=126.061..1344.729 rows=4 loops=1)
   Filter: ((group_id IS NULL) AND (this_group_id = 46) AND (this_group_id = 46))
 Total runtime: 1344.777 ms
(3 rows)
Time: 1345.684 ms

orig=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46 and this_group_id = 46 and this_group_id = 46;
                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..11791.58 rows=6199 width=43) (actual time=12.455..12.465 rows=4 loops=1)
   Index Cond: ((this_group_id = 46) AND (this_group_id = 46) AND (this_group_id = 46) AND (this_group_id = 46))
   Filter: (group_id IS NULL)
 Total runtime: 12.519 ms
(4 rows)
Time: 13.932 ms

orig=# select count(this_group_id) from group_data where this_group_id=46;
 count
--------
 797426
(1 row)
Time: 1843.869 ms

orig=# select count(this_group_id) from group_data where this_group_id=46 and group_id is null;
 count
-------
     4
(1 row)
Time: 1647.350 ms

====================================================

Same thing on 7.4.5:

kth=# create index foo on group_data(this_group_id) where group_id is null;
CREATE INDEX
kth=# vacuum analyze group_data;
VACUUM
kth=# \timing
Timing is on.
kth=# explain analyze select * from group_data where group_id is null and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40408.72 rows=109317 width=43) (actual time=0.154..0.175 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.241 ms
(4 rows)
Time: 2,785 ms

kth=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40408.72 rows=109317 width=43) (actual time=0.033..0.054 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.121 ms
(4 rows)
Time: 1,607 ms

kth=# explain analyze select * from group_data where group_id is null and this_group_id = 46 and this_group_id = 46 and this_group_id = 46 and this_group_id = 46;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo on group_data  (cost=0.00..40408.72 rows=109317 width=43) (actual time=0.033..0.055 rows=4 loops=1)
   Index Cond: (this_group_id = 46)
   Filter: (group_id IS NULL)
 Total runtime: 0.119 ms
(4 rows)
Time: 1,702 ms

kth=# select count(this_group_id) from group_data where this_group_id=46;
 count
--------
 797426
(1 row)
Time: 1821,433 ms

kth=# select count(this_group_id) from group_data where this_group_id=46 and group_id is null;
 count
-------
     4
(1 row)
Time: 1,635 ms

/Palle

--On torsdag, januari 13, 2005 16.33.58 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> On 7.4.5, it uses the index, but on 8.0rc5, it does not:
>
> Have you ANALYZEd the 8.0 table lately? Those rowcount estimates look
> mighty far off.
>
> 			regards, tom lane
Thanks, but the behaviour seems identical. :(

/Palle

--On fredag, januari 14, 2005 07.37.46 +1100 John Hansen <john@geeknet.com.au> wrote:

>> create index foo on group_data(this_group_id) where group_id is null;
>
> Try this instead:
>
> create index foo on group_data(this_group_id) where nullvalue(group_id);
>
> And:
>
> select * from group_data where this_group_id = 46 and nullvalue(group_id);
>
> ... John
Palle Girgensohn <girgen@pingpong.net> writes:
> How do you mean they look far off?

> Seq Scan on group_data  (cost=0.00..47544.43 rows=114164 width=43) (actual
> time=114.015..1334.479 rows=4 loops=1)

114164 estimated vs 4 actual rows is pretty far off. Perhaps something skewed about the data distribution?

> If you want, I can send you the data.

If it's not too huge, sure, send it to me off-list.

			regards, tom lane
> Thanks, but the behaviour seems identical. :(

Odd, though, that I was never able to get null values indexed (the index was never used) unless I used this approach...

Hmmm, on the other hand... just realised you're not actually indexing null values; here, IS NULL is the qualifier for the partial index...

As Tom suggested, try vacuum analyze on that table...

... John
> If you want, I can send you the data.

If you can make a dump of the schema and data available for download somewhere, I won't mind having a go at it...

... John
--On torsdag, januari 13, 2005 17.03.41 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> How do you mean they look far off?
>
>> Seq Scan on group_data  (cost=0.00..47544.43 rows=114164 width=43)
>> (actual time=114.015..1334.479 rows=4 loops=1)
>
> 114164 estimated vs 4 actual rows is pretty far off. Perhaps something
> skewed about the data distribution?

Well, it might seem strange, but it is a quite normal data distribution for this application, believe me.

>> If you want, I can send you the data.
>
> If it's not too huge, sure, send it to me off-list.

I'm doing that now.

/Palle
Palle Girgensohn <girgen@pingpong.net> writes:
> --On torsdag, januari 13, 2005 18.18.37 -0500 Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>> So there's something nuts about the statistics in this case.

On looking into it, it's the same old issue of not having column correlation statistics. pg_stats shows that ANALYZE estimated the fraction of rows with null group_id as 0.137667 (versus the exact value of 0.147, not too bad considering I used the default statistics target), and it estimated the fraction with this_group_id = 46 as 0.358 (vs actual 0.369, ditto). The problem is that it then estimates the total selectivity as 0.137667 * 0.358, or a bit under 5%, much too high to make an indexscan sensible. In reality there are only 4 rows with this combination of values, but the planner has no way to know that.

> Anything I can do about it?

I thought of a fairly miserable hack, which relies on the fact that 8.0 does know how to accumulate statistics on functional indexes:

group=# create index fooi on group_data (abs(this_group_id)) WHERE group_id IS NULL;
CREATE INDEX
group=# analyze group_data;
ANALYZE
group=# explain select * from group_data where group_id is null and abs(this_group_id) = 46;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using fooi on group_data  (cost=0.00..5302.60 rows=1802 width=42)
   Index Cond: (abs(this_group_id) = 46)
   Filter: (group_id IS NULL)
(3 rows)

(The choice of abs() is arbitrary; it just has to be something other than the unadorned column.) In this situation the planner will look at the stats for the functional index and discover that in that index there aren't many 46's, so it comes out with a more reasonable rowcount estimate.

We should probably make it accumulate stats on partial indexes even when the index columns aren't expressions. This example shows that useful stats can be derived that way. Too late for 8.0 though...

			regards, tom lane
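Tom's "a bit under 5%" falls straight out of multiplying the two per-column fractions, which you can inspect yourself in pg_stats. A sketch (table and column names are the ones from this thread; the arithmetic in the comments uses the numbers Tom quotes above):

```sql
-- Where the bad rows estimate comes from (a sketch using the numbers
-- quoted above):
SELECT null_frac        -- ~0.137667: estimated fraction with group_id IS NULL
FROM pg_stats
WHERE tablename = 'group_data' AND attname = 'group_id';

-- With no cross-column statistics, the planner assumes the two
-- conditions are independent and multiplies their selectivities:
--     0.137667 * 0.358 ~= 0.0493   (a bit under 5%)
-- Applied to the table's total row count, that yields the rows=114164
-- estimate seen in the seqscan plan, while the true joint count is 4.
```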
I wrote:
> I thought of a fairly miserable hack, which relies on the fact that 8.0
> does know how to accumulate statistics on functional indexes:

Never mind, it turns out that doesn't work the way I thought. It's actually falling back to a default estimate :-(. I still think it'd be a good idea to use stats on partial indexes in future releases, but right at the moment we aren't doing any such thing.

Here's an even more miserable hack: use a non-partial functional index over a multicolumn expression as a poor man's way of creating cross-column stats. For example, assuming all this_group_id values are positive:

group=# create function myfunc(int,int) returns int as
group-# 'SELECT CASE WHEN $2 IS NULL THEN $1 ELSE -$1 END' language sql immutable;
CREATE FUNCTION
group=# create index fooi2 on group_data (myfunc(this_group_id, group_id));
CREATE INDEX
group=# analyze group_data;
ANALYZE
group=# explain analyze select * from group_data where myfunc(this_group_id, group_id)=46;
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using fooi2 on group_data  (cost=0.00..2948.85 rows=792 width=43) (actual time=0.171..0.198 rows=4 loops=1)
   Index Cond: (CASE WHEN (group_id IS NULL) THEN this_group_id ELSE (- this_group_id) END = 46)
 Total runtime: 0.304 ms
(3 rows)

Dunno if you're desperate enough to try that ... but it does seem to work.

			regards, tom lane
--On torsdag, januari 13, 2005 18.55.11 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> --On torsdag, januari 13, 2005 18.18.37 -0500 Tom Lane
>> <tgl@sss.pgh.pa.us> wrote:
>>> So there's something nuts about the statistics in this case.
>
> On looking into it, it's the same old issue of not having column
> correlation statistics. pg_stats shows that ANALYZE estimated the
> fraction of rows with null group_id as 0.137667 (versus exact value
> of 0.147, not too bad considering I used the default statistics target)
> and it estimated the fraction with this_group_id = 46 as 0.358
> (vs actual 0.369, ditto). The problem is that it then estimates the
> total selectivity as 0.137667 * 0.358 or a bit under 5%, much too high
> to make an indexscan sensible. In reality there are only 4 rows with
> this combination of values, but the planner has no way to know that.
>
>> Anything I can do about it?
>
> I thought of a fairly miserable hack, which relies on the fact that 8.0
> does know how to accumulate statistics on functional indexes:
>
> group=# create index fooi on group_data (abs(this_group_id)) WHERE
> group_id IS NULL;
> CREATE INDEX
> group=# analyze group_data;
> ANALYZE
> group=# explain select * from group_data where group_id is null and
> abs(this_group_id) = 46;
>  Index Scan using fooi on group_data  (cost=0.00..5302.60 rows=1802 width=42)
>    Index Cond: (abs(this_group_id) = 46)
>    Filter: (group_id IS NULL)
> (3 rows)
>
> (The choice of abs() is arbitrary, it just has to be something other
> than the unadorned column.) In this situation the planner will look at
> the stats for the functional index and discover that in that index there
> aren't many 46's, so it comes out with a more reasonable rowcount
> estimate.

OK, I think I understand. And this is changed between 7.4.x and 8.0?

> We should probably make it accumulate stats on partial indexes even when
> the index columns aren't expressions. This example shows that useful
> stats can be derived that way. Too late for 8.0 though...

True, but for the next version, perhaps? :)

Trying all this out, I realize that on 7.4.5 I can sometimes get different results after `vacuum analyze' vs. a plain `analyze' (again, not exactly the same data, and I cannot reproduce this on the other machine with the data I sent you). It does not really relate to the question above, but perhaps you can explain why I get different results? I join with a table person; group_data.item_text has person.userid as a foreign key constraint.

7.4.5:

pp=# vacuum analyze group_data;
VACUUM
Time: 256353,802 ms
pp=# select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
 last_name |  userid
-----------+----------
 Lastname  | u1wmd5nn
(1 row)
Time: 6223,123 ms

pp=# explain analyze select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=76016.57..77215.19 rows=92632 width=23) (actual time=7649.496..7649.512 rows=1 loops=1)
   ->  Sort  (cost=76016.57..76416.11 rows=159816 width=23) (actual time=7649.481..7649.487 rows=1 loops=1)
         Sort Key: p.last_name, p.userid
         ->  Hash Join  (cost=3003.90..62203.64 rows=159816 width=23) (actual time=7649.254..7649.435 rows=1 loops=1)
               Hash Cond: ("outer".item_text = "inner".userid)
               ->  Seq Scan on group_data gd  (cost=0.00..53238.10 rows=160565 width=12) (actual time=431.078..5927.410 rows=5 loops=1)
                     Filter: ((this_group_id = 46) AND (group_id IS NULL))
               ->  Hash  (cost=2229.32..2229.32 rows=92632 width=23) (actual time=1555.797..1555.797 rows=0 loops=1)
                     ->  Seq Scan on person p  (cost=0.00..2229.32 rows=92632 width=23) (actual time=0.093..856.728 rows=92632 loops=1)
 Total runtime: 7652.771 ms
(10 rows)
Time: 7656,909 ms

pp=# select * from group_data where this_group_id=46 and group_id is null;
 this_group_id | group_id | item_text | item_int | link_path
---------------+----------+-----------+----------+-----------
            46 |          |           |     1223 | :46:
            46 |          |           |     1228 | :46:
            46 |          |           |     1328 | :46:
            46 |          |           |     1391 | :46:
            46 |          | u1wmd5nn  |          | :46:
(5 rows)
Time: 5891,716 ms

pp=# analyze group_data;
ANALYZE
Time: 3210,096 ms
pp=# explain select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=111417.47..113761.30 rows=92632 width=23)
   ->  Sort  (cost=111417.47..112198.75 rows=312510 width=23)
         Sort Key: p.last_name, p.userid
         ->  Hash Join  (cost=3003.90..79231.40 rows=312510 width=23)
               Hash Cond: ("outer".item_text = "inner".userid)
               ->  Index Scan using group_data_tgid_gidnull_idx on group_data gd  (cost=0.00..65091.35 rows=275225 width=11)
                     Index Cond: (this_group_id = 46)
                     Filter: (group_id IS NULL)
               ->  Hash  (cost=2229.32..2229.32 rows=92632 width=23)
                     ->  Seq Scan on person p  (cost=0.00..2229.32 rows=92632 width=23)
(10 rows)
Time: 6,647 ms

pp=# select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   group_data gd join person p on (p.userid = gd.item_text)
pp-# where
pp-#   gd.this_group_id = 46
pp-#   and gd.group_id is null;
 last_name |  userid
-----------+----------
 Lastname  | u1wmd5nn
(1 row)
Time: 772,969 ms

pp=# select
pp-#   distinct p.last_name,
pp-#   p.userid
pp-# from
pp-#   person p, group_data gd
pp-# where
pp-#   p.userid = gd.item_text
pp-#   and gd.this_group_id = 46
pp-#   and gd.group_id is null
pp-# ;
 last_name |  userid
-----------+----------
 Lastname  | u1wmd5nn
(1 row)
Time: 720,345 ms

/Palle
Palle Girgensohn <girgen@pingpong.net> writes:
> Trying all this out, I realize that on 7.4.5, I can sometimes get different
> results after `vacuum analyze' vs. a plain `analyze' (again, not exactly
> the same data, and I cannot reproduce this on the other machine with the
> data I sent you). It does not really relate to the question above, but
> perhaps you can explain how come I get different results?

No surprise. vacuum analyze produces an exact total row count, whereas analyze can only produce an approximate total row count (since it only samples the table rather than groveling over every row). Sometimes the approximate count will be far enough off to affect the estimates.

			regards, tom lane
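The total row count Tom refers to is stored in pg_class.reltuples. A quick way to see how far a sample-based estimate has drifted from reality is to compare it with an exact count (a sketch, not from the thread; group_data is the table discussed above):

```sql
-- Compare the planner's stored row count with the exact one.
-- After VACUUM ANALYZE the two should agree almost exactly;
-- after a plain ANALYZE, reltuples is only a sample-based estimate
-- and may be noticeably off on a skewed or recently-churned table.
SELECT reltuples FROM pg_class WHERE relname = 'group_data';

SELECT count(*) FROM group_data;
```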
--On torsdag, januari 13, 2005 19.44.57 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> Trying all this out, I realize that on 7.4.5, I can sometimes get
>> different results after `vacuum analyze' vs. a plain `analyze' (again,
>> not exactly the same data, and I cannot reproduce this on the other
>> machine with the data I sent you). It does not really relate to the
>> question above, but perhaps you can explain how come I get different
>> results?
>
> No surprise. vacuum analyze produces an exact total row count, whereas
> analyze can only produce an approximate total row count (since it only
> samples the table rather than groveling over every row). Sometimes the
> approximate count will be far enough off to affect the estimates.

Reasonable. Thanks for clarifying that. In the normal case, vacuum analyze is better, I guess?

/Palle
--On torsdag, januari 13, 2005 19.32.38 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
>> I thought of a fairly miserable hack, which relies on the fact that 8.0
>> does know how to accumulate statistics on functional indexes:
>
> Never mind, it turns out that doesn't work the way I thought. It's
> actually falling back to a default estimate :-(. I still think it'd
> be a good idea to use stats on partial indexes in future releases,
> but right at the moment we aren't doing any such thing.

Oh, uh, too bad :(

> Here's an even more miserable hack: use a non-partial functional index
> over a multicolumn expression as a poor man's way of creating
> cross-column stats. For example, assuming all this_group_id values are
> positive:

[snip]

> Dunno if you're desperate enough to try that ... but it does seem to work.

Interesting. Yes, I might be that desperate, actually. As desperate as 30 s vs 30 ms can get me... :)

And this one would actually accumulate stats alright? Looks kinda hairy to me. It's not really a partial index anymore, but I guess that doesn't matter...

How is the behaviour on 7.4 for this stuff? It seems 7.4 does use the partial index, but not always, as shown in a previous mail. Will this example work on both versions, or just for 8.0?

BTW, shall I file some kind of bug report / feature request about accumulating stats for partial indices etc.? Perhaps you're best equipped to file such a report ;-)

Regards,
Palle
>> Dunno if you're desperate enough to try that ... but it does seem to work.

If you're going to hack anyway, then why not just simply tell the planner that you know better, and that it should use the index regardless of the stats collected?

set enable_seqscan=false;
<your original query here>;
set enable_seqscan=true;

... John
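If you go this route, a slightly safer variant is to scope the override to a single transaction with SET LOCAL, so a failure partway through can't leave seqscans disabled for the rest of the session (a sketch; the query is the one from earlier in the thread):

```sql
-- Scope the planner override to one transaction: SET LOCAL reverts
-- automatically at COMMIT or ROLLBACK, so an error in the query
-- cannot leave enable_seqscan off for subsequent statements.
BEGIN;
SET LOCAL enable_seqscan = off;
SELECT * FROM group_data
WHERE this_group_id = 46 AND group_id IS NULL;
COMMIT;
```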
--On fredag, januari 14, 2005 11.52.38 +1100 John Hansen <john@geeknet.com.au> wrote:

>>> Dunno if you're desperate enough to try that ... but it does seem to
>>> work.
>
> If you're going to hack anyway, then why not just simply tell the planner
> that you know better, and that it should use the index regardless of the
> stats collected?
>
> set enable_seqscan=false;
> <your original query here>;
> set enable_seqscan=true;

Yeah, maybe I'll do that, but I stumbled on at least one more complicated query that lost performance from disabling seqscans: it had a union with a query that gains performance from disabling seqscans... heh...

Thanks for your time, guys! I now have some options to move on. I would love to see this working better in a future version of PostgreSQL, though ;-)

Regards,
Palle
Palle Girgensohn <girgen@pingpong.net> writes:
> Interesting. Yes, I might be that desperate, actually. As desperate as
> 30 s vs 30 ms can get me... :)

> And this one would actually accumulate stats alright? Looks kinda hairy
> to me. It's not really a partial index anymore, but I guess that doesn't
> matter...

Yeah, it would. The trick is finding a mapping function that will map all the cases you care about to distinct values.

> How is the behaviour on 7.4 for this stuff? It seems 7.4 does use the
> partial index, but not always, as shown in a previous mail. Will this
> example work on both versions, or just for 8.0?

7.4 doesn't keep stats on functional indexes, so the hack is a nonstarter there. I am not sure why you are seeing different results on 7.4 than 8.0 for the original example --- 7.4 is certainly not smarter than 8.0, and we seem to have ruled out the idea that some sort of glitch is confusing the 8.0 planner.

			regards, tom lane
John Hansen <john@geeknet.com.au> writes:
>> Thanks, but the behaviour seems identical. :(
>
> odd tho, that I was never able to get null values indexed (index was
> never used) unless I used this approach....

You're mixing up the indexed column with the WHERE clause of a partial index. They behave differently.

Null values *are* normally indexed in Postgres. The problem is that the optimizer doesn't recognize IS NULL as an indexable operation, so they don't always help unless you do something like you describe above.

But the WHERE clause on partial indexes is another story. The optimizer recognizes IS NULL as being equivalent to IS NULL, so it recognizes that the partial index is usable just fine.

-- greg
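The distinction above can be made concrete by putting the two index shapes side by side (a sketch based on the schema in this thread; the behaviour described is that of the 7.4/8.0-era optimizer):

```sql
-- Partial index: IS NULL appears only in the index *predicate*.
-- The planner proves that the query's WHERE clause implies the
-- predicate, so this index is a candidate for the 4-row case:
CREATE INDEX foo ON group_data (this_group_id) WHERE group_id IS NULL;

SELECT * FROM group_data
WHERE this_group_id = 46 AND group_id IS NULL;  -- predicate implied

-- Ordinary index on group_id: the NULLs are physically stored in the
-- index, but in these versions "group_id IS NULL" is not recognized
-- as an indexable condition, so the index cannot be scanned for it:
CREATE INDEX bar ON group_data (group_id);

SELECT * FROM group_data WHERE group_id IS NULL;  -- bar not usable here
```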
Palle Girgensohn <girgen@pingpong.net> writes:
>> No surprise. vacuum analyze produces an exact total row count, whereas
>> analyze can only produce an approximate total row count (since it only
>> samples the table rather than groveling over every row). Sometimes the
>> approximate count will be far enough off to affect the estimates.

> Reasonable. Thanks for clarifying that. In the normal case, vacuum analyze
> is better, I guess?

If you intend to do both steps, the combined command is definitely better than issuing them separately. I wouldn't say that you need to do the combined command in situations where you'd otherwise do just one.

			regards, tom lane