Thread: Question on PostgreSQL DB behavior w.r.t JOIN and sort order.
<div class="WordSection1"><p class="MsoNormal">Hi folks,<p class="MsoNormal"> <p class="MsoNormal">I am seeing this behaviorchange in postgreSQL DB when compared to SQL Server DB when JOIN is performed. The sort order is not retained whenJOIN is performed in PostgreSQL DB.<p class="MsoNormal">Is it expected? Is there a solution available to retain the sortorder during JOIN? We have applications that expects the same sort order during JOIN and we want to support our applicationon PostgreSQL DB.<p class="MsoNormal">DO we need to indicate to the PostgreSQL DB optimizer to not change thesort order? If so, how to do it and what are it’s implications.<p class="MsoNormal"> <p class="MsoNormal">From the belowexample, you can see that the results are not in sorted order in PostgreSQL when compared to SQL Server DB.<p class="MsoNormal"> <pclass="MsoNormal"><b>SQLServer:</b><p class="MsoNormal"> <p class="MsoNormal" style="text-autospace:none"><spanstyle="font-size:10.0pt;font-family:"Courier New";color:blue">SELECT</span><span style="font-size:10.0pt;font-family:"CourierNew""> <span style="color:blue">top</span> 10 KH_<span style="color:gray">.</span>r_object_id<spanstyle="color:gray">,</span> KH_<span style="color:gray">.</span><span style="color:fuchsia">object_name</span><span style="color:blue">FROM</span> dbo<span style="color:gray">.</span>dm_location_s<span style="color:blue">AS</span> ZS_ <span style="color:gray">INNER</span> <spanstyle="color:gray"> JOIN</span></span><p class="MsoNormal" style="text-autospace:none"><span style="font-size:10.0pt;font-family:"CourierNew""> dbo<span style="color:gray">.</span>dm_sysobject_s<span style="color:blue">AS</span> KH_ <span style="color:blue">ON</span> ZS_<spanstyle="color:gray">.</span>r_object_id <span style="color:gray">=</span> KH_<span style="color:gray">.</span>r_object_id</span><p class="MsoNormal">3a00d5128000013f storage_01<p class="MsoNormal">3a00d51280000140 common<p class="MsoNormal">3a00d51280000141 events<p class="MsoNormal">3a00d51280000142 log<p class="MsoNormal">3a00d51280000143 config<p class="MsoNormal">3a00d51280000144 dm_dba<p class="MsoNormal">3a00d51280000145 auth_plugin<p class="MsoNormal">3a00d51280000146 ldapcertdb_loc<p class="MsoNormal">3a00d51280000147 temp<p class="MsoNormal">3a00d51280000148 dm_ca_store_fetch_location<p class="MsoNormal"> <p class="MsoNormal"><b>PostgreSQL:</b><pclass="MsoNormal"> <p class="MsoNormal">dm_repo6_docbase=> SELECT KH_.r_object_id,KH_.object_name FROM dm_location_s AS ZS_ INNER JOIN dm_sysobject_s AS KH_ ON ZS_.r_object_id = KH_.r_object_idlimit 10;<p class="MsoNormal"> <p class="MsoNormal"> r_object_id | object_name<p class="MsoNormal">------------------+---------------------------<pclass="MsoNormal">3a0003e98000a597 | TDfFXMigrateRMOPDQ71486_1<pclass="MsoNormal">3a0003e980007679 | 738296_2<p class="MsoNormal">3a0003e980000142 | log<p class="MsoNormal">3a0003e980000143| config<p class="MsoNormal">3a0003e980000140 | common<p class="MsoNormal">3a0003e98000013f| storage_01<p class="MsoNormal">3a0003e980000141 | events<p class="MsoNormal">3a0003e980000144| dm_dba<p class="MsoNormal">3a0003e980000145 | auth_plugin<p class="MsoNormal">3a0003e980000146| ldapcertdb_loc<p class="MsoNormal">(10 rows)<p class="MsoNormal"> <p class="MsoNormal">Thanks,<pclass="MsoNormal">Sekhar</div>
"Venkatesan, Sekhar" <sekhar.venkatesan@emc.com> writes: > I am seeing this behavior change in postgreSQL DB when compared to SQL Server DB when JOIN is performed. The sort orderis not retained when JOIN is performed in PostgreSQL DB. What sort order? You did not specify any ORDER BY clause, so the DBMS is entitled to return rows in any order it feels like. I do not know anything about this "top 10" modifier you've got in the SQL Server version, but I suspect it's implying a sort order. In Postgres, if you want a specific row ordering, you need to say ORDER BY. regards, tom lane
<div class="WordSection1"><p class="MsoPlainText">Hi Tom,<p class="MsoPlainText"> <p class="MsoPlainText">You can disregardthe "TOP 10" modifier. That was added by me to bring down the huge number of results being returned.<p class="MsoPlainText">Evenwithout the TOP modifier, SQL server is returning rows in sorted order (sorting columns based onthe r_object_id (1<sup>st</sup>) column I think) but PostgreSQL doesn’t.<p class="MsoPlainText">Is this anything to dowith indexes?<p class="MsoPlainText">So from what I understand, you say in postgres, if the sort order is not specified,postgres returns results in any order. Am I right?<p class="MsoPlainText"> <p class="MsoPlainText">-----OriginalMessage-----<br /> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] <br /> Sent: Tuesday, February09, 2016 10:30 AM<br /> To: Venkatesan, Sekhar<br /> Cc: pgsql-sql@postgresql.org<br /> Subject: Re: [SQL] Questionon PostgreSQL DB behavior w.r.t JOIN and sort order.<p class="MsoPlainText"> <p class="MsoPlainText">"Venkatesan,Sekhar" <<a href="mailto:sekhar.venkatesan@emc.com"><span style="color:windowtext;text-decoration:none">sekhar.venkatesan@emc.com</span></a>>writes:<p class="MsoPlainText">>I am seeing this behavior change in postgreSQL DB when compared to SQL Server DB when JOIN is performed.The sort order is not retained when JOIN is performed in PostgreSQL DB.<p class="MsoPlainText"> <p class="MsoPlainText">Whatsort order? You did not specify any ORDER BY clause, so the DBMS is entitled to return rows inany order it feels like.<p class="MsoPlainText"> <p class="MsoPlainText">I do not know anything about this "top 10" modifieryou've got in the SQL Server version, but I suspect it's implying a sort order. In Postgres, if you want a specificrow ordering, you need to say ORDER BY.<p class="MsoPlainText"> <p class="MsoPlainText"> regards, tom lane</div>
<br /><br /><div class="moz-cite-prefix">On 02/08/2016 10:21 PM, Venkatesan, Sekhar wrote:<br /></div><blockquote cite="mid:F84DE43FDACD4C45AA84E2DA016FAE2F1C65BEC8@MX105CL01.corp.emc.com"type="cite"><style><!-- /* Font Definitions */ @font-face{font-family:Calibri;panose-1:2 15 5 2 2 2 4 3 2 4;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal{margin:0in;margin-bottom:.0001pt;font-size:11.0pt;font-family:"Calibri","sans-serif";} a:link, span.MsoHyperlink{mso-style-priority:99;color:blue;text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed{mso-style-priority:99;color:purple;text-decoration:underline;} p.MsoPlainText, li.MsoPlainText, div.MsoPlainText{mso-style-priority:99;mso-style-link:"Plain Text Char";margin:0in;margin-bottom:.0001pt;font-size:11.0pt;font-family:"Calibri","sans-serif";} span.PlainTextChar{mso-style-name:"Plain Text Char";mso-style-priority:99;mso-style-link:"Plain Text";font-family:"Calibri","sans-serif";} .MsoChpDefault{mso-style-type:export-only;font-family:"Calibri","sans-serif";} @page WordSection1{size:8.5in 11.0in;margin:1.0in 1.0in 1.0in 1.0in;} div.WordSection1{page:WordSection1;} --></style><div class="WordSection1"><p class="MsoPlainText">Hi Tom,<p class="MsoPlainText"> <p class="MsoPlainText">Youcan disregard the "TOP 10" modifier. That was added by me to bring down the huge number of resultsbeing returned.<p class="MsoPlainText">Even without the TOP modifier, SQL server is returning rows in sorted order(sorting columns based on the r_object_id (1<sup>st</sup>) column I think) but PostgreSQL doesn’t.<p class="MsoPlainText">Isthis anything to do with indexes?<p class="MsoPlainText">So from what I understand, you say in postgres,if the sort order is not specified, postgres returns results in any order. Am I right?<p class="MsoPlainText"> <pclass="MsoPlainText">-----Original Message-----<br /> From: Tom Lane [<a class="moz-txt-link-freetext"href="mailto:tgl@sss.pgh.pa.us">mailto:tgl@sss.pgh.pa.us</a>] <br /> Sent: Tuesday, February09, 2016 10:30 AM<br /> To: Venkatesan, Sekhar<br /> Cc: <a class="moz-txt-link-abbreviated" href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br/> Subject: Re: [SQL] Question on PostgreSQL DB behaviorw.r.t JOIN and sort order.<p class="MsoPlainText"> <p class="MsoPlainText">"Venkatesan, Sekhar" <<a href="mailto:sekhar.venkatesan@emc.com"moz-do-not-send="true"><span style="color:windowtext;text-decoration:none"><a class="moz-txt-link-abbreviated"href="mailto:sekhar.venkatesan@emc.com">sekhar.venkatesan@emc.com</a></span></a>> writes:<pclass="MsoPlainText">> I am seeing this behavior change in postgreSQL DB when compared to SQL Server DB whenJOIN is performed. The sort order is not retained when JOIN is performed in PostgreSQL DB.<p class="MsoPlainText"> <pclass="MsoPlainText">What sort order? You did not specify any ORDER BY clause, so the DBMS is entitledto return rows in any order it feels like.<p class="MsoPlainText"> <p class="MsoPlainText">I do not know anythingabout this "top 10" modifier you've got in the SQL Server version, but I suspect it's implying a sort order. InPostgres, if you want a specific row ordering, you need to say ORDER BY.<p class="MsoPlainText"> <p class="MsoPlainText"> regards, tom lane</div></blockquote> In my experience,this is ofter termed "disc order", implying what ever order the resultant tuples were discovered while processingthe data. If MSSQL server is giving an order without explicit instruction to do so you may be incurring an unwantedsort operation. Is (any of) the data in a "clustered index": iirc that implies an on-disc ordering and the resultset my be reflecting that.<br /><br />
So from what I understand, you say in postgres, if the sort order is not specified, postgres returns results in any order. Am I right?
Is there a way to tell the optimizer to retain the sort order if that is possible please?
From: David G. Johnston [mailto:david.g.johnston@gmail.com]
Sent: Tuesday, February 09, 2016 11:17 AM
To: Venkatesan, Sekhar
Cc: Tom Lane; pgsql-sql@postgresql.org
Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order.
On Monday, February 8, 2016, Venkatesan, Sekhar <sekhar.venkatesan@emc.com> wrote:
So from what I understand, you say in postgres, if the sort order is not specified, postgres returns results in any order. Am I right?
Yes. It will optimize for speed without any regard for maintaining any kind of ordering. You may get the desired order for various reasons but without ORDER BY you cannot be guaranteed.
David J.
Is there a way to tell the optimizer to retain the sort order if that is possible please?
Yes. is there an option/configuration to tell the postgres optimizer/planner to generate plans to include the sort order instead of speed?
From: David G. Johnston [mailto:david.g.johnston@gmail.com]
Sent: Tuesday, February 09, 2016 11:20 AM
To: Venkatesan, Sekhar
Cc: Tom Lane; pgsql-sql@postgresql.org
Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order.
On Monday, February 8, 2016, Venkatesan, Sekhar <sekhar.venkatesan@emc.com> wrote:
Is there a way to tell the optimizer to retain the sort order if that is possible please?
You mean, besides the ORDER BY clause?
David J.
Which order would that be?Yes. is there an option/configuration to tell the postgres optimizer/planner to generate plans to include the sort order instead of speed?
From: David G. Johnston [mailto:david.g.johnston@gmail.com]
Sent: Tuesday, February 09, 2016 11:20 AM
To: Venkatesan, Sekhar
Cc: Tom Lane; pgsql-sql@postgresql.org
Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order.
On Monday, February 8, 2016, Venkatesan, Sekhar <sekhar.venkatesan@emc.com> wrote:
Is there a way to tell the optimizer to retain the sort order if that is possible please?
You mean, besides the ORDER BY clause?
David J.
Yes. is there an option/configuration to tell the postgres optimizer/planner to generate plans to include the sort order instead of speed?
On 02/08/2016 09:53 PM, Venkatesan, Sekhar wrote: > Yes. is there an option/configuration to tell the postgres > optimizer/planner to generate plans to include the sort order instead of > speed? What columns in a table would that be and then what order? > > *From:*David G. Johnston [mailto:david.g.johnston@gmail.com] > *Sent:* Tuesday, February 09, 2016 11:20 AM > *To:* Venkatesan, Sekhar > *Cc:* Tom Lane; pgsql-sql@postgresql.org > *Subject:* Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and > sort order. > > On Monday, February 8, 2016, Venkatesan, Sekhar > <sekhar.venkatesan@emc.com <mailto:sekhar.venkatesan@emc.com>> wrote: > > Is there a way to tell the optimizer to retain the sort order if that is > possible please? > > You mean, besides the ORDER BY clause? > > David J. > -- Adrian Klaver adrian.klaver@aklaver.com
On 02/08/2016 10:53 PM, Venkatesan, Sekhar wrote:Which order would that be?Yes. is there an option/configuration to tell the postgres optimizer/planner to generate plans to include the sort order instead of speed?
From: David G. Johnston [mailto:david.g.johnston@gmail.com]
Sent: Tuesday, February 09, 2016 11:20 AM
To: Venkatesan, Sekhar
Cc: Tom Lane; pgsql-sql@postgresql.org
Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order.
On Monday, February 8, 2016, Venkatesan, Sekhar <sekhar.venkatesan@emc.com> wrote:
Is there a way to tell the optimizer to retain the sort order if that is possible please?
You mean, besides the ORDER BY clause?
David J.
My concern here is that I want to maintain consistency ( in our application to retain sort order) between different databases. I don't see the issue in SQL Server and Oracle databases. "SELECT KH_.r_object_id, KH_.object_name FROM dbo.dm_location_s AS ZS_ INNER JOIN dbo.dm_sysobject_sAS KH_ ON ZS_.r_object_id = KH_.r_object_id " The above query is sorted based on the first column in the select list. Same is not happening in PostgreSQL. Is this something to do with collation setting in database? -----Original Message----- From: Adrian Klaver [mailto:adrian.klaver@aklaver.com] Sent: Tuesday, February 09, 2016 11:32 AM To: Venkatesan, Sekhar; David G. Johnston Cc: Tom Lane; pgsql-sql@postgresql.org Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order. On 02/08/2016 09:53 PM, Venkatesan, Sekhar wrote: > Yes. is there an option/configuration to tell the postgres > optimizer/planner to generate plans to include the sort order instead > of speed? What columns in a table would that be and then what order? > > *From:*David G. Johnston [mailto:david.g.johnston@gmail.com] > *Sent:* Tuesday, February 09, 2016 11:20 AM > *To:* Venkatesan, Sekhar > *Cc:* Tom Lane; pgsql-sql@postgresql.org > *Subject:* Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and > sort order. > > On Monday, February 8, 2016, Venkatesan, Sekhar > <sekhar.venkatesan@emc.com <mailto:sekhar.venkatesan@emc.com>> wrote: > > Is there a way to tell the optimizer to retain the sort order if that > is possible please? > > You mean, besides the ORDER BY clause? > > David J. > -- Adrian Klaver adrian.klaver@aklaver.com
My concern here is that I want to maintain consistency ( in our application to retain sort order) between different databases.
I don't see the issue in SQL Server and Oracle databases.
"SELECT KH_.r_object_id, KH_.object_name FROM dbo.dm_location_s AS ZS_ INNER JOIN
dbo.dm_sysobject_s AS KH_ ON ZS_.r_object_id = KH_.r_object_id "
The above query is sorted based on the first column in the select list. Same is not happening in PostgreSQL.
Is this something to do with collation setting in database?
Venkatesan, Sekhar schrieb am 09.02.2016 um 06:21: > So from what I understand, you say in postgres, if the sort order is not specified, > postgres returns results in any order. Am I right? This is nothing Postgres specific. This is true for *every* DBMS. Without an order by, the DBMS is free to return the rows in any order it wants. If you have seen a specific order in SQL Server that was pure coincidence and can *not* be relied upon.
Actually, the behavior you've seen in SQL Server may be a pure artifact of the table structures underneath your queries. Most database architects will (appropriately) put a primary key on every table and the default in SQL Server is to make primarykeys clustered...and clustering arranges the physical storage of the rows in the increasing order of that key. Ifthat is what exists in the SQL Server db, then KH_.r_object_id would be a clustered PK and so of course would return rowsin that order, automatically. As Kellerer said, otherwise it is random ordering, without an Order By clause. Postgres PKs are not clustered by default, so you'll experience the random row ordering you mentioned. Cluster that columnand you'll get the same behavior...but read up on postgres clustering since it works very differently than SQL Server. Mike -----Original Message----- From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Venkatesan, Sekhar Sent: Monday, February 08, 2016 10:15 PM To: Adrian Klaver <adrian.klaver@aklaver.com>; David G. Johnston <david.g.johnston@gmail.com> Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-sql@postgresql.org Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order. My concern here is that I want to maintain consistency ( in our application to retain sort order) between different databases. I don't see the issue in SQL Server and Oracle databases. "SELECT KH_.r_object_id, KH_.object_name FROM dbo.dm_location_s AS ZS_ INNER JOIN dbo.dm_sysobject_sAS KH_ ON ZS_.r_object_id = KH_.r_object_id " The above query is sorted based on the first column in the select list. Same is not happening in PostgreSQL. Is this something to do with collation setting in database? -----Original Message----- From: Adrian Klaver [mailto:adrian.klaver@aklaver.com] Sent: Tuesday, February 09, 2016 11:32 AM To: Venkatesan, Sekhar; David G. Johnston Cc: Tom Lane; pgsql-sql@postgresql.org Subject: Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and sort order. On 02/08/2016 09:53 PM, Venkatesan, Sekhar wrote: > Yes. is there an option/configuration to tell the postgres > optimizer/planner to generate plans to include the sort order instead > of speed? What columns in a table would that be and then what order? > > *From:*David G. Johnston [mailto:david.g.johnston@gmail.com] > *Sent:* Tuesday, February 09, 2016 11:20 AM > *To:* Venkatesan, Sekhar > *Cc:* Tom Lane; pgsql-sql@postgresql.org > *Subject:* Re: [SQL] Question on PostgreSQL DB behavior w.r.t JOIN and > sort order. > > On Monday, February 8, 2016, Venkatesan, Sekhar > <sekhar.venkatesan@emc.com <mailto:sekhar.venkatesan@emc.com>> wrote: > > Is there a way to tell the optimizer to retain the sort order if that > is possible please? > > You mean, besides the ORDER BY clause? > > David J. > -- Adrian Klaver adrian.klaver@aklaver.com -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Sekhar,
You will have to specify a sort order with "order by <field>" clause before the limit clause. It's the only way I know the order to be guaranteed to remain the same. Hope this helps.
Stuart
Hi folks,
I am seeing this behavior change in postgreSQL DB when compared to SQL Server DB when JOIN is performed. The sort order is not retained when JOIN is performed in PostgreSQL DB.
Is it expected? Is there a solution available to retain the sort order during JOIN? We have applications that expects the same sort order during JOIN and we want to support our application on PostgreSQL DB.
DO we need to indicate to the PostgreSQL DB optimizer to not change the sort order? If so, how to do it and what are it’s implications.
From the below example, you can see that the results are not in sorted order in PostgreSQL when compared to SQL Server DB.
SQLServer:
SELECT top 10 KH_.r_object_id, KH_.object_name FROM dbo.dm_location_s AS ZS_ INNER JOIN
dbo.dm_sysobject_s AS KH_ ON ZS_.r_object_id = KH_.r_object_id
3a00d5128000013f storage_01
3a00d51280000140 common
3a00d51280000141 events
3a00d51280000142 log
3a00d51280000143 config
3a00d51280000144 dm_dba
3a00d51280000145 auth_plugin
3a00d51280000146 ldapcertdb_loc
3a00d51280000147 temp
3a00d51280000148 dm_ca_store_fetch_location
PostgreSQL:
dm_repo6_docbase=> SELECT KH_.r_object_id, KH_.object_name FROM dm_location_s AS ZS_ INNER JOIN dm_sysobject_s AS KH_ ON ZS_.r_object_id = KH_.r_object_id limit 10;
r_object_id | object_name
------------------+---------------------------
3a0003e98000a597 | TDfFXMigrateRMOPDQ71486_1
3a0003e980007679 | 738296_2
3a0003e980000142 | log
3a0003e980000143 | config
3a0003e980000140 | common
3a0003e98000013f | storage_01
3a0003e980000141 | events
3a0003e980000144 | dm_dba
3a0003e980000145 | auth_plugin
3a0003e980000146 | ldapcertdb_loc
(10 rows)
Thanks,
Sekhar