Home > mailing lists

Re: analyzing intermediate query - Mailing list pgsql-performance

From	Scott Carey
Subject	Re: analyzing intermediate query
Date	December 2, 2008 20:22:06
Msg-id	BDFBB77C9E07BE4A984DAAE981D19F961ACA2FE1B4@EXVMBX018-1.exch018.msoutlookonline.net Whole thread Raw
In response to	Re: analyzing intermediate query ("Andrus" <kobruleht2@hot.ee>)
List	pgsql-performance

Tree view

Have you tried running ANALYZE on the temp table before selecting out of it?  That should give it the statistics
necessaryto correctly guess the output of a group by on a single column. 

ANALYZE temptbl;
SELECT dok.*
 FROM dok
JOIN  (SELECT dokumnr FROM  temptbl GROUP BY dokumnr ) x USING(dokumnr);

-----Original Message-----
From: Andrus [mailto:kobruleht2@hot.ee]
Sent: Tuesday, December 02, 2008 10:59 AM
To: Scott Carey; pgsql-performance@postgresql.org; PFC
Subject: Re: [PERFORM] analyzing intermediate query

Scott,

>Often times, switching an inner subselect that requires a distinct to a
>group by on that column yields better results.  In this case, the IN should
>be equivalent, so it probably will not help.  This would look like:

SELECT dok.*
 FROM dok
JOIN  (SELECT dokumnr FROM  temptbl GROUP BY dokumnr ) x USING(dokumnr);

Thank you. This may be great idea.
I changed my query to use GROUP BY instead of DISTINCT

>Whether that hepls depends on how big dokumnr is and where the query
>bottleneck is.

I'm wondering how this can solve the issue when there is single or few
dokumnr columns.
Planner still thinks that temptbl projection contains 1000 rows and uses seq
scan instead of using bitmap index on dok table.

I tried

SELECT dok.*
 FROM dok
JOIN  (SELECT dokumnr FROM  temptbl GROUP BY dokumnr ANALYZE ) x
USING(dokumnr);

but got error.

> Note there are subtle differences between DISTINCT and GROUP BY with
> respect to nulls.

dokumnr is int type and is not null always.

Andrus.

pgsql-performance by date:

From: "Andrus"
Date: 02 December 2008, 17:59:03
Subject: Re: analyzing intermediate query

From: Matthew Wakeling
Date: 03 December 2008, 15:01:56
Subject: Postgres using more memory than it should

Re: analyzing intermediate query - Mailing list pgsql-performance

Previous

Next