Fwd: Joins and links - Mailing list pgsql-hackers

From Leon
Subject Fwd: Joins and links
Date
Msg-id 6850.990705@udmnet.ru
Whole thread Raw
Responses Re: [HACKERS] Fwd: Joins and links
List pgsql-hackers
Hello hackers!

I posted this message to general mailing list, and was told
that hackers' list is more appropriate place to post this
message to. What will you say about it?


This is a forwarded message
From: Leon <leon@udmnet.ru>
To: pgsql-general <pgsql-general@postgreSQL.org>
Date: Monday, July 05, 1999, 5:46:36 PM
Subject: Joins and links

===8<==============Original message text===============
Hello all!
You probably remember me - recently I complained about speedof joins in Postgres. After a short investigation the way
wasfoundin which Postgres's optimizer can do the right job. Itwas constructive discussion. Now I want to tell you what
couldmakePostgres better and faster. And what will make us (ourdevelopment group) happy. Maybe I am bothering someone,
ifIdo - tell me that.
 
Let me begin.
First of all, some accounting reports need to be deliveredvery fast - within minutes or so. And what's bad is thatquite
afew of these reports are quite time-consuming and searchintensive. In particular, internals of these reports includea
lotof joins on tables.
 
Secondly, almost all of accounting information naturallyfits into network data model, which can be implemented
veryefficiently.
This stuff described here is not accounting-specific, itcan be found in every database which uses master-detailtables
andother such types of relations.
 
So. How is join being performed in such cases? Although I amnot an expert, I can imagine the way: first there is an
(index)scanon first table, and then an (index) scan on the second.It is the best way, reality could be much worse as we
haveseen.
 
How can we radically improve performance in such cases? Thereis a simple and quite obvious way. (For you not to think
thatIam hallucinating I will tell you that there exist somereal servers that offer such features I am talking about)We
shouldmake a real reference in one table to another! Thatmeans there could be special data type called, say,
"link",whichis a physical record number in the foreign table.
 
Queries could look like this:
table1:a int4b link (->table2)
table2:c int4recnum (system auxiliary field, really a record number in the table)
select * from table2 where table1.a > 5 and table1.b = table2.recnum
Such joins can fly really fast, as practice shows :)Just consider: the thing table1.b = table2.recnum is a
READY-MADEjoin,so server doesn't have to build anything on top of that. Itcan simply perform lookup through link, and
sinceit is a physicalrecord number, this is done with the efficiency of C pointers! Thusperformance gain is ENORMOUS.
 
And it simplifies the optimizer, because it doesn't have to decideanything about whether to use indices and such like.
Thejoin isperformed always the same way, and it is the best way.
 
This feature, being implemented, could bring Postgres aheadof most commercial servers, so proving creative abilities
offreesoftware community. Let us make a step in the future!
 

Best regards,Leon

===8<===========End of original message text===========


Best regards,Leon                            mailto:leon@udmnet.ru




pgsql-hackers by date:

Previous
From: Thomas Lockhart
Date:
Subject: [Fwd: sgml tool]
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Fwd: Joins and links