Re: BUG #13112: Catastrophic performance degradation without DISTINCT ON statement - Mailing list pgsql-bugs

From David G. Johnston
Subject Re: BUG #13112: Catastrophic performance degradation without DISTINCT ON statement
Date
Msg-id CAKFQuwaCMGeeLQzikV2F55dec4Wf9EAZrDaEAx-O48XpiAQnrg@mail.gmail.com
Whole thread Raw
In response to BUG #13112: Catastrophic performance degradation without DISTINCT ON statement  (oyvind.harboe@zylin.com)
List pgsql-bugs
On Tue, Apr 21, 2015 at 12:16 AM, <oyvind.harboe@zylin.com> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      13112
> Logged by:          =C3=98yvind Harboe
> Email address:      oyvind.harboe@zylin.com
> PostgreSQL version: 9.1.0
> Operating system:   Ubuntu
> Description:
>
> I've been testing out PostgreSQL vs. Derby/MS SQL for our application whe=
n
> I
> ran into a problem where the performance of PostgreSQL went from great to
> abysmal for no apparent reason.
>
> After a bit of digging, I've found that the problem is with the SQL
> statement that Apache Cayenne generates.
>
> Apache Cayenne generates statements of the following form which yields ba=
d
> performance on PostgreSQL with complicated WHERE statements and numerous
> columns:
>
> 1) SELECT DISTINCT a,b,c,d,e,f ... WHERE somecomplicatedstatement
>
> If I rewrite this statement to the form below using the 'DISTINCT ON()'
> syntax(which is PostgreSQL specific dialect), then I get great performanc=
e
> again:
>
> 2) SELECT DISTINCT ON(a) a,b,c,d,e,f ... WHERE somecomplicatedstatement
>
> Numbers on my machine:
>
> 1) 44000ms
>
> 2) 4300ms
>
> Here's where I read up on the DISTINCT ON syntax:
> http://www.postgresql.org/docs/9.4/static/sql-select.html


So what - the two queries are not equivalent.

There may be room for improvement here but you've given insufficient
information to help anyone who cares to dig deeper.

see =E2=80=8Bhttps://wiki.postgresql.org/wiki/SlowQueryQuestions
=E2=80=8B
Distinct requires sorting - the ON clause just limits how much - so using
it in place of writing a correct query will always yield sub-optimal
results.

This is not a bug even if PostgreSQL could be taught to handle these kinds
of ORM queries more effectively.

David J.

pgsql-bugs by date:

Previous
From: oyvind.harboe@zylin.com
Date:
Subject: BUG #13112: Catastrophic performance degradation without DISTINCT ON statement
Next
From: Kevin Grittner
Date:
Subject: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)