Oftentimes, switching an inner subselect that requires a DISTINCT to a GROUP BY on that column yields better results.
In this case the IN should be equivalent, so it probably will not help. The GROUP BY version would look like:
SELECT dok.*
FROM dok
JOIN (SELECT dokumnr FROM temptbl GROUP BY dokumnr ) x USING(dokumnr);
Whether that helps depends on how big dokumnr is and where the query bottleneck is. Note there are subtle differences
between DISTINCT and GROUP BY with respect to nulls.
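
One way to see where the bottleneck is would be to run both forms under EXPLAIN ANALYZE and compare the plans and timings. This is only a sketch using the table and column names from this thread; the actual plans will depend on your data and server version:

```sql
-- DISTINCT form of the subselect
EXPLAIN ANALYZE
SELECT dok.*
FROM dok
JOIN (SELECT DISTINCT dokumnr FROM temptbl) x USING (dokumnr);

-- GROUP BY form of the same subselect
EXPLAIN ANALYZE
SELECT dok.*
FROM dok
JOIN (SELECT dokumnr FROM temptbl GROUP BY dokumnr) x USING (dokumnr);
```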
________________________________________
From: pgsql-performance-owner@postgresql.org [pgsql-performance-owner@postgresql.org] On Behalf Of Andrus
[kobruleht2@hot.ee]
Sent: Tuesday, December 02, 2008 7:50 AM
To: pgsql-performance@postgresql.org; PFC
Subject: Re: [PERFORM] analyzing intermediate query
> Oh, I just thought about something, I don't remember in which version it
> was added, but :
>
> EXPLAIN ANALYZE SELECT sum(column1) FROM (VALUES ...a million
> integers... ) AS v
>
> Postgres is perfectly happy with that ; it's either a bit slow (about 1
> second) or very fast depending on how you view things...
I tried this in 8.1.4:
select * from (values (0)) xx
but got
ERROR: syntax error at or near ")"
SQL state: 42601
Character: 26
Even if this worked, it may not be a solution: I need to apply DISTINCT to the
temporary table. The temporary table may contain duplicate values, and without
DISTINCT the join produces an invalid result.
The temporary table itself is created from data in server tables; it is not
generated from a list.
I can use
SELECT dok.*
FROM dok
WHERE dokumnr IN (SELECT dokumnr FROM temptbl)
but this never seems to use a bitmap index scan in 8.1.4.
Sadly, creating a second temporary table from the first temporary table
specifically for this query seems to be the only solution.
When will a materialized row count be added, so that statistics are exact and
select count(*) from tbl runs fast?
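
A minimal sketch of that second-temporary-table workaround, assuming a hypothetical table name temptbl_distinct (the name is only for illustration):

```sql
-- Deduplicate the key column once into a second temporary table.
-- temptbl_distinct is a hypothetical name, not an existing object.
CREATE TEMPORARY TABLE temptbl_distinct AS
SELECT DISTINCT dokumnr FROM temptbl;

-- Collect statistics so the planner knows the real row count.
ANALYZE temptbl_distinct;

-- Each dokumnr now appears at most once, so the join cannot duplicate dok rows.
SELECT dok.*
FROM dok
JOIN temptbl_distinct USING (dokumnr);
```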
Andrus.