Thread: strange query filter problems

strange query filter problems

From
"Jonas Henriksen"
Date:
Hi,
I have a problem with a slow query. (I have run vacuum full analyze!)
It seems that the query below works OK because the query planner
filters on the date first. It takes about 0.3 sec:
EXPLAIN
SELECT
*
FROM sskjema s INNER JOIN tskjema t USING(sskjema_pkey)
WHERE t.species::char(12) LIKE 'TAGGMAKRELL%'::char(12)
AND s.date >=20050101

"Merge Join  (cost=6.02..3899.33 rows=1 width=228)"
"  Merge Cond: ("outer".sskjema_pkey = "inner".sskjema_pkey)"
"  ->  Index Scan using sskjema_pkey on sskjema s  (cost=0.00..3868.95
rows=9738 width=157)"
"        Filter: (date >= 20050101)"
"  ->  Sort  (cost=6.02..6.03 rows=1 width=75)"
"        Sort Key: t.sskjema_pkey"
"        ->  Index Scan using speciesix on tskjema t  (cost=0.00..6.01
rows=1 width=75)"
"              Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND
(species ~<~ 'TAGGMAKRELM'::bpchar))"
"              Filter: (species ~~ 'TAGGMAKRELL%'::text)"

However, if I add a second date-condition to further cut the
result-size, the species-column is used as the first filter, drasticly
increasing the query-time to more than a minute:
EXPLAIN
SELECT
*
FROM sskjema s INNER JOIN tskjema t USING(sskjema_pkey)
WHERE t.species::char(12) LIKE 'TAGGMAKRELL%'::char(12)
AND s.date >=20050101
AND s.date <=20051231

"Nested Loop  (cost=0.00..4049.18 rows=1 width=228)"
"  Join Filter: ("inner".sskjema_pkey = "outer".sskjema_pkey)"
"  ->  Index Scan using speciesix on tskjema t  (cost=0.00..6.01
rows=1 width=75)"
"        Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND
(species ~<~ 'TAGGMAKRELM'::bpchar))"
"        Filter: (species ~~ 'TAGGMAKRELL%'::text)"
"  ->  Index Scan using dateix on sskjema s  (cost=0.00..4025.13
rows=1443 width=157)"
"        Index Cond: ((date >= 20050101) AND (date <= 20051231))"

Any suggestions how to get the planner do the query in the best way?

regards Jonas:))

Re: strange query filter problems

From
Martijn van Oosterhout
Date:
On Wed, Apr 19, 2006 at 01:27:45PM +0200, Jonas Henriksen wrote:
> Hi,
> I have a problem with a slow query. (I have run vacuum full analyze!)
> It seems that the query below works OK because the query planner
> filters on the date first. It takes about 0.3 sec:

Can we see an EXPLAIN ANALYZE of the two queries, so we can see what's
actually happening? My guess is that in the second case, the index scan
on sskjema matches more rows than it expects...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: strange query filter problems

From
"Jonas Henriksen"
Date:
Yes, explain analyze looks like this:

EXPLAIN
ANALYZE
SELECT
*
FROM sskjema s inner join tskjema t using(sskjema_pkey) where
t.species::char(12) like 'TAGGMAKRELL%'::char(12)
and s.date >=20050101

"Merge Join  (cost=6.02..3899.33 rows=1 width=228) (actual
time=150.274..331.782 rows=190 loops=1)"
"  Merge Cond: ("outer".sskjema_pkey = "inner".sskjema_pkey)"
"  ->  Index Scan using sskjema_pkey on sskjema s  (cost=0.00..3868.95
rows=9738 width=157) (actual time=104.465..208.185 rows=14417
loops=1)"

"        Filter: (date >= 20050101)"
"  ->  Sort  (cost=6.02..6.03 rows=1 width=75) (actual
time=34.693..40.956 rows=1703 loops=1)"
"        Sort Key: t.sskjema_pkey"
"        ->  Index Scan using speciesix on tskjema t  (cost=0.00..6.01
rows=1 width=75) (actual time=0.140..24.594 rows=1703 loops=1)"

"              Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND
(species ~<~ 'TAGGMAKRELM'::bpchar))"
"              Filter: (species ~~ 'TAGGMAKRELL%'::text)"
"Total runtime: 333.158 ms"



EXPLAIN
ANALYZE
SELECT
*
FROM sskjema s inner join tskjema t using(sskjema_pkey) where
t.species::char(12) like 'TAGGMAKRELL%'::char(12)
and s.date >=20050101
and s.date <=20051231

"Nested Loop  (cost=0.00..4049.18 rows=1 width=228) (actual
time=1260.988..252110.934 rows=150 loops=1)"
"  Join Filter: ("inner".sskjema_pkey = "outer".sskjema_pkey)"
"  ->  Index Scan using speciesix on tskjema t  (cost=0.00..6.01
rows=1 width=75) (actual time=0.256..50.875 rows=1703 loops=1)"

"        Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND
(species ~<~ 'TAGGMAKRELM'::bpchar))"
"        Filter: (species ~~ 'TAGGMAKRELL%'::text)"
"  ->  Index Scan using dateix on sskjema s  (cost=0.00..4025.13
rows=1443 width=157) (actual time=0.026..76.451 rows=14340
loops=1703)"

"        Index Cond: ((date >= 20050101) AND (date <= 20051231))"
"Total runtime: 252111.940 ms"


Jonas:))

On 4/19/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Wed, Apr 19, 2006 at 01:27:45PM +0200, Jonas Henriksen wrote:
> > Hi,
> > I have a problem with a slow query. (I have run vacuum full analyze!)
> > It seems that the query below works OK because the query planner
> > filters on the date first. It takes about 0.3 sec:
>
> Can we see an EXPLAIN ANALYZE of the two queries, so we can see what's
> actually happening? My guess is that in the second case, the index scan
> on sskjema matches more rows than it expects...
>
> Have a nice day,
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to litigate.
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.1 (GNU/Linux)
>
> iD8DBQFERiCRIB7bNG8LQkwRAq+MAJ4rhGLzU1sYszrT7DUWzPH2+bjVzwCfS1ne
> 5y7A3WhI4PqfDaulFB2hPvc=
> =N5EG
> -----END PGP SIGNATURE-----
>
>
>

Re: strange query filter problems

From
Martijn van Oosterhout
Date:
On Wed, Apr 19, 2006 at 01:53:46PM +0200, Jonas Henriksen wrote:
> Yes, explain analyze looks like this:

Well, incorrect statistics are definitly the culprit, look:


> "  ->  Index Scan using sskjema_pkey on sskjema s  (cost=0.00..3868.95 rows=9738 width=157) (actual
time=104.465..208.185rows=14417 loops=1)" 

50% off, not bad.

> "        ->  Index Scan using speciesix on tskjema t  (cost=0.00..6.01 rows=1 width=75) (actual time=0.140..24.594
rows=1703loops=1)" 
> "              Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))"
> "              Filter: (species ~~ 'TAGGMAKRELL%'::text)"

Youch, 170200% percent off, which explains why it gets wildly bad
timings. Can you increase the statistics on the species column? Check
the docs for how.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: strange query filter problems

From
"Jonas Henriksen"
Date:
Thanks, increasing the statistics did the trick:
ALTER TABLE tskjema ALTER  species SET STATISTICS 999

Is there a rule of thumb to tell how to set the statistics?

Jonas:))


On 4/19/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Wed, Apr 19, 2006 at 01:53:46PM +0200, Jonas Henriksen wrote:
> > Yes, explain analyze looks like this:
>
> Well, incorrect statistics are definitly the culprit, look:
>
>
> > "  ->  Index Scan using sskjema_pkey on sskjema s  (cost=0.00..3868.95 rows=9738 width=157) (actual
time=104.465..208.185rows=14417 loops=1)" 
>
> 50% off, not bad.
>
> > "        ->  Index Scan using speciesix on tskjema t  (cost=0.00..6.01 rows=1 width=75) (actual time=0.140..24.594
rows=1703loops=1)" 
> > "              Index Cond: ((species ~>=~ 'TAGGMAKRELL'::bpchar) AND (species ~<~ 'TAGGMAKRELM'::bpchar))"
> > "              Filter: (species ~~ 'TAGGMAKRELL%'::text)"
>
> Youch, 170200% percent off, which explains why it gets wildly bad
> timings. Can you increase the statistics on the species column? Check
> the docs for how.
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to litigate.
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.1 (GNU/Linux)
>
> iD8DBQFERigoIB7bNG8LQkwRAhIdAJ9zFw0pkEqctLqHmbPteg7ofrDlEgCfS9ZO
> IscpR2d+wW7fzW3OUM3QYvo=
> =OewT
> -----END PGP SIGNATURE-----
>
>
>