Home > mailing lists

Tuning queries and distinct behaviour - Mailing list pgsql-general

From	Gordan Bobic
Subject	Tuning queries and distinct behaviour
Date	January 18, 2001 11:07:53
Msg-id	002f01c08167$de6d6620$8000000a@localdomain Whole thread Raw
In response to	Re: Re: MySQL file system (Marc SCHAEFER <schaefer@alphanet.ch>)
Responses	Re: Tuning queries and distinct behaviour Re: Tuning queries and distinct behaviour
List	pgsql-general

Tree view

Is there a way to tune queries?

I'm doing queries that join around 5-6 tables. All join fields are indexed
either in hash (where tables are small enough and join is done on "="), or
btree (big tables, not joined on "="). The tables have between several
hundred and several tens of millions of records. The problem is that this
tends to take a _VERY_ long time. I know that I'm asking for a bit much on
such a huge task, but if I break the queries down manually into 10 simper
ones that I could run manually with temporary tables, each of those would
take a few seconds at most. The optimizer occasionally decides to do
sequential scans, and this is probably what is killing the performance. Is
there any way to give the optimizer hints? I suspect that it would help in
a vast majority of cases if it looked ad the where clauses in views and
selects on views and tried cutting down the working set through that, and
then pruning down the rest as it went along. It just seems a bit strange
that doing subqueries with temporary tables should be so much faster.

Doing VACUUM ANALYZE often helps, but not always. Is there any way to give
the optimizer hints on how to speed up selects on views that do big joins,
both inner and outer?

Another question - I have to do a join on the "<" operation. Something
like:

SELECT * FROM Table1, Table2 WHERE Table1.Field1 < Table2.Field2 ORDER BY
Table1.Field1 DESC, Table2.Field2 DESC;

This will give me a very large set of records. However, I only really need
a few of those records. I only want the highest Field1, Field2 combination
records for some other field in Table1. Effectively, something like:

CREATE VIEW SomeView AS SELECT * FROM Table1, Table2 WHERE Table1.Field1 <
Table2.Field2 ORDER BY Table1.Field1 DESC, Table2.Field2 DESC;

and then doing:

SELECT DISTINCT ON (Table1.Field3) * FROM SomeView. I would hope that
DISTINCT would pick the first record returned for each of the different
occurences in SomeView.

Unfortunately it doesn't. By having a quick scan at the returned records,
it doesn't seem to pick them in any particular order. Is this the case? And
is there an SQL standard that says which records should DISTINCT return
(first, last, or arbitrary)? And is there a way to enforce this behaviour,
just in case the default behaviour changes in the future?

Alternatively, can anyone think of a solution to this problem?

Thanks.

Gordan

pgsql-general by date:

From: "Martin A. Marques"
Date: 18 January 2001, 10:17:09
Subject: compilation error

From: Travis Bauer
Date: 18 January 2001, 11:17:31
Subject: Re: plperl and sqrt.

Tuning queries and distinct behaviour - Mailing list pgsql-general

Previous

Next