Fwd: Joins and links - Mailing list pgsql-hackers
From | Leon |
---|---|
Subject | Fwd: Joins and links |
Date | |
Msg-id | 6850.990705@udmnet.ru Whole thread Raw |
Responses |
Re: [HACKERS] Fwd: Joins and links
|
List | pgsql-hackers |
Hello hackers! I posted this message to general mailing list, and was told that hackers' list is more appropriate place to post this message to. What will you say about it? This is a forwarded message From: Leon <leon@udmnet.ru> To: pgsql-general <pgsql-general@postgreSQL.org> Date: Monday, July 05, 1999, 5:46:36 PM Subject: Joins and links ===8<==============Original message text=============== Hello all! You probably remember me - recently I complained about speedof joins in Postgres. After a short investigation the way wasfoundin which Postgres's optimizer can do the right job. Itwas constructive discussion. Now I want to tell you what couldmakePostgres better and faster. And what will make us (ourdevelopment group) happy. Maybe I am bothering someone, ifIdo - tell me that. Let me begin. First of all, some accounting reports need to be deliveredvery fast - within minutes or so. And what's bad is thatquite afew of these reports are quite time-consuming and searchintensive. In particular, internals of these reports includea lotof joins on tables. Secondly, almost all of accounting information naturallyfits into network data model, which can be implemented veryefficiently. This stuff described here is not accounting-specific, itcan be found in every database which uses master-detailtables andother such types of relations. So. How is join being performed in such cases? Although I amnot an expert, I can imagine the way: first there is an (index)scanon first table, and then an (index) scan on the second.It is the best way, reality could be much worse as we haveseen. How can we radically improve performance in such cases? Thereis a simple and quite obvious way. (For you not to think thatIam hallucinating I will tell you that there exist somereal servers that offer such features I am talking about)We shouldmake a real reference in one table to another! Thatmeans there could be special data type called, say, "link",whichis a physical record number in the foreign table. Queries could look like this: table1:a int4b link (->table2) table2:c int4recnum (system auxiliary field, really a record number in the table) select * from table2 where table1.a > 5 and table1.b = table2.recnum Such joins can fly really fast, as practice shows :)Just consider: the thing table1.b = table2.recnum is a READY-MADEjoin,so server doesn't have to build anything on top of that. Itcan simply perform lookup through link, and sinceit is a physicalrecord number, this is done with the efficiency of C pointers! Thusperformance gain is ENORMOUS. And it simplifies the optimizer, because it doesn't have to decideanything about whether to use indices and such like. Thejoin isperformed always the same way, and it is the best way. This feature, being implemented, could bring Postgres aheadof most commercial servers, so proving creative abilities offreesoftware community. Let us make a step in the future! Best regards,Leon ===8<===========End of original message text=========== Best regards,Leon mailto:leon@udmnet.ru
pgsql-hackers by date: