Thread: strange query filter problems
Hi, I have a problem with a slow query. (I have run vacuum full analyze!) It seems that the query below works OK because the query planner filters on the date first. It takes about 0.3 sec: EXPLAIN SELECT * FROM sskjema s INNER JOIN tskjema t USING(sskjema_pkey) WHERE t.species::char(12) LIKE 'TAGGMAKRELL%'::char(12) AND s.date >=20050101 "Merge Join (cost=6.02..3899.33 rows=1 width=228)" " Merge Cond: ("outer".sskjema_pkey = "inner".sskjema_pkey)" " -> Index Scan using sskjema_pkey on sskjema s (cost=0.00..3868.95 rows=9738 width=157)" " Filter: (date >= 20050101)" " -> Sort (cost=6.02..6.03 rows=1 width=75)" " Sort Key: t.sskjema_pkey" " -> Index Scan using speciesix on tskjema t (cost=0.00..6.01 rows=1 width=75)" " Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))" " Filter: (species ~~ 'TAGGMAKRELL%'::text)" However, if I add a second date-condition to further cut the result-size, the species-column is used as the first filter, drasticly increasing the query-time to more than a minute: EXPLAIN SELECT * FROM sskjema s INNER JOIN tskjema t USING(sskjema_pkey) WHERE t.species::char(12) LIKE 'TAGGMAKRELL%'::char(12) AND s.date >=20050101 AND s.date <=20051231 "Nested Loop (cost=0.00..4049.18 rows=1 width=228)" " Join Filter: ("inner".sskjema_pkey = "outer".sskjema_pkey)" " -> Index Scan using speciesix on tskjema t (cost=0.00..6.01 rows=1 width=75)" " Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))" " Filter: (species ~~ 'TAGGMAKRELL%'::text)" " -> Index Scan using dateix on sskjema s (cost=0.00..4025.13 rows=1443 width=157)" " Index Cond: ((date >= 20050101) AND (date <= 20051231))" Any suggestions how to get the planner do the query in the best way? regards Jonas:))
On Wed, Apr 19, 2006 at 01:27:45PM +0200, Jonas Henriksen wrote: > Hi, > I have a problem with a slow query. (I have run vacuum full analyze!) > It seems that the query below works OK because the query planner > filters on the date first. It takes about 0.3 sec: Can we see an EXPLAIN ANALYZE of the two queries, so we can see what's actually happening? My guess is that in the second case, the index scan on sskjema matches more rows than it expects... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Yes, explain analyze looks like this: EXPLAIN ANALYZE SELECT * FROM sskjema s inner join tskjema t using(sskjema_pkey) where t.species::char(12) like 'TAGGMAKRELL%'::char(12) and s.date >=20050101 "Merge Join (cost=6.02..3899.33 rows=1 width=228) (actual time=150.274..331.782 rows=190 loops=1)" " Merge Cond: ("outer".sskjema_pkey = "inner".sskjema_pkey)" " -> Index Scan using sskjema_pkey on sskjema s (cost=0.00..3868.95 rows=9738 width=157) (actual time=104.465..208.185 rows=14417 loops=1)" " Filter: (date >= 20050101)" " -> Sort (cost=6.02..6.03 rows=1 width=75) (actual time=34.693..40.956 rows=1703 loops=1)" " Sort Key: t.sskjema_pkey" " -> Index Scan using speciesix on tskjema t (cost=0.00..6.01 rows=1 width=75) (actual time=0.140..24.594 rows=1703 loops=1)" " Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))" " Filter: (species ~~ 'TAGGMAKRELL%'::text)" "Total runtime: 333.158 ms" EXPLAIN ANALYZE SELECT * FROM sskjema s inner join tskjema t using(sskjema_pkey) where t.species::char(12) like 'TAGGMAKRELL%'::char(12) and s.date >=20050101 and s.date <=20051231 "Nested Loop (cost=0.00..4049.18 rows=1 width=228) (actual time=1260.988..252110.934 rows=150 loops=1)" " Join Filter: ("inner".sskjema_pkey = "outer".sskjema_pkey)" " -> Index Scan using speciesix on tskjema t (cost=0.00..6.01 rows=1 width=75) (actual time=0.256..50.875 rows=1703 loops=1)" " Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))" " Filter: (species ~~ 'TAGGMAKRELL%'::text)" " -> Index Scan using dateix on sskjema s (cost=0.00..4025.13 rows=1443 width=157) (actual time=0.026..76.451 rows=14340 loops=1703)" " Index Cond: ((date >= 20050101) AND (date <= 20051231))" "Total runtime: 252111.940 ms" Jonas:)) On 4/19/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > On Wed, Apr 19, 2006 at 01:27:45PM +0200, Jonas Henriksen wrote: > > Hi, > > I have a problem with a slow query. (I have run vacuum full analyze!) > > It seems that the query below works OK because the query planner > > filters on the date first. It takes about 0.3 sec: > > Can we see an EXPLAIN ANALYZE of the two queries, so we can see what's > actually happening? My guess is that in the second case, the index scan > on sskjema matches more rows than it expects... > > Have a nice day, > -- > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > > From each according to his ability. To each according to his ability to litigate. > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.1 (GNU/Linux) > > iD8DBQFERiCRIB7bNG8LQkwRAq+MAJ4rhGLzU1sYszrT7DUWzPH2+bjVzwCfS1ne > 5y7A3WhI4PqfDaulFB2hPvc= > =N5EG > -----END PGP SIGNATURE----- > > >
On Wed, Apr 19, 2006 at 01:53:46PM +0200, Jonas Henriksen wrote: > Yes, explain analyze looks like this: Well, incorrect statistics are definitly the culprit, look: > " -> Index Scan using sskjema_pkey on sskjema s (cost=0.00..3868.95 rows=9738 width=157) (actual time=104.465..208.185rows=14417 loops=1)" 50% off, not bad. > " -> Index Scan using speciesix on tskjema t (cost=0.00..6.01 rows=1 width=75) (actual time=0.140..24.594 rows=1703loops=1)" > " Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))" > " Filter: (species ~~ 'TAGGMAKRELL%'::text)" Youch, 170200% percent off, which explains why it gets wildly bad timings. Can you increase the statistics on the species column? Check the docs for how. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Thanks, increasing the statistics did the trick: ALTER TABLE tskjema ALTER species SET STATISTICS 999 Is there a rule of thumb to tell how to set the statistics? Jonas:)) On 4/19/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > On Wed, Apr 19, 2006 at 01:53:46PM +0200, Jonas Henriksen wrote: > > Yes, explain analyze looks like this: > > Well, incorrect statistics are definitly the culprit, look: > > > > " -> Index Scan using sskjema_pkey on sskjema s (cost=0.00..3868.95 rows=9738 width=157) (actual time=104.465..208.185rows=14417 loops=1)" > > 50% off, not bad. > > > " -> Index Scan using speciesix on tskjema t (cost=0.00..6.01 rows=1 width=75) (actual time=0.140..24.594 rows=1703loops=1)" > > " Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))" > > " Filter: (species ~~ 'TAGGMAKRELL%'::text)" > > Youch, 170200% percent off, which explains why it gets wildly bad > timings. Can you increase the statistics on the species column? Check > the docs for how. > -- > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > > From each according to his ability. To each according to his ability to litigate. > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.1 (GNU/Linux) > > iD8DBQFERigoIB7bNG8LQkwRAhIdAJ9zFw0pkEqctLqHmbPteg7ofrDlEgCfS9ZO > IscpR2d+wW7fzW3OUM3QYvo= > =OewT > -----END PGP SIGNATURE----- > > >