Thread: Bidirectional hard joins (fwd)
Tom, I attached a message from my colleague and think it'd be interesting to you. A short history: During developing of one project on Windows platform, Teodor has discovered a pretty nice feature of Gigabase (free embedded database by Konstantin Knizhnik, http://www.geocities.com/kknizhnik/gigabase.html), which helps us a lot. Ivan has wrote a proposal for implementing it in PostgreSQL. Could you, please, comment the proposal. Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---------- Forwarded message ---------- Date: Wed, 3 Apr 2002 01:40:05 +0400 (MSD) From: Ivan E. Panchenko <ivan@xray.sai.msu.ru> To: Oleg Bartunov <oleg@sai.msu.su>, Teodor Sigaev <teodor@stack.net> Subject: Bidirectional hard joins Hi, Here we propose some essential improvement of postgreSQL functionality, which may provide a great perfomance increase. 1. Problem The fastest way to find and fetch a record from a table is to perform a SELECT ... WHERE record.id = value. Probably, an index scan would be performed for this SELECT. Such index scan seems to be fast, but there are some cases where it may appear too slow. The most evident case is the case of a sub-query, which can arise as a result of a join or a nested select statement. If it were possible to store direct references to database records in the tables, joins could be implemented in much more effective way. 2. Possible soultion Creating a datatype which stores direct reference to the record (i.e., physical location of the tuple) is only a part of the solution. When a record that is referenced is updated its physical location can be changed, so the references to it should be updated. To make this possible, the referenced record should remember all the references to itself. Thus, we consider the direct tuple references as bidirectional links, or "bidirectional hard joins". These "hard joins" are in some sense similar to hard links in a filesystem (in this analogy, classic joins are like symbolic links). Philosophically, this means a convergence between indexes and tables: a table behaives like an index for an other table. Obviously, this is is a nonrelational feature, and it requires some special SQL syntax. Below we provide some examples for clarification of the use of the proposed feature. 3. Examples CREATE JOIN paternity FROM man.children TO child.father ; -- creates a field man.children containing a reference to thetable child, and a field father in the table child with a back reference. INSERT INTO man VALUES ('Bob Scott'); INSERT INTO child VALUES ('Charles Scott'); LINK paternity WHERE (man.name = 'Bob Scott'),(child.name = 'Charles Scott'); -- Create a link betewen the two records. INSERT INTO child VALUES ('Doug Scott'); LINK paternity (man.name = 'Bob Scott'),(child.name = 'Doug Scott'); SELECT child.name from child, man WHERE paternity(man,child) AND man.name = 'Bob Scott'; -- Find all Bob's children > Charles Scott > Doug Scott > 2 records seleted. --------------------------------------------------------------- This syntax was thought of just for illustration and is not proposed to implement (now?). 4. Performance When direct joins are used in select statements, they can strongly increase performance. Let us examine the query plan of the request ("Find all Bob's children") from the example above in the present day postgres. create table man (id SERIAL,name text); create table child(id SERIAL,name text, parent_id int4 references man(id)); .. populate the tables ... and create indexes... explain selectchild.name from child, man where child.parent_id = man.id and man.name = 'Bob Scott'; Nested Loop -> Index Scan using man_n on man -> Index Scan using child_par on child In a hypotetical postgres with hard joins it could be: Nested Loop -> Index Scan using man_n on man -> Direct retrieval on child I.e., the for each retrieved "man" record we retrieve the "child" records directly using hard join. The real overhead for this operation should be neglible in comparison with index scan. Using the hard joins require some additional overhead in updates. In fact, after updating the record which takes part in such join, the references to this record in the other records should be also updated. This operation is not essentially new for postgres as similar things are done with indexes when an indexed record is updated. Hence, the overhead for updates is not greater than the overhead for updating indexes. 5. Implementation and conclusion Effective implementing of hard joins requires hard changes to postgres, most serious of them probably in the executor, where a new method "fetch record by reference" should be added in addition to "index scan" and "seq scan". Also the optimizer should be taught to deal with this. The update support is not so hard as it is similar to the updating of indexes. Though the implementation of such hard joins is really a complicated task, the performance it brings should be tremendous, so we consider discussing this important.
On Thu, 2002-04-04 at 14:17, Oleg Bartunov wrote:Subject: Bidirectional hard joins > > Hi, > > > Here we propose some essential improvement of postgreSQL functionality, > which may provide a great perfomance increase. > > 1. Problem > > The fastest way to find and fetch a record from a table is to > perform a SELECT ... WHERE record.id = value. > Probably, an index scan would be performed for this SELECT. > > Such index scan seems to be fast, but there are some cases where it may > appear too slow. The most evident case is the case of a sub-query, which > can arise as a result of a join or a nested select statement. > > If it were possible to store direct references to database > records in the tables, joins could be implemented in much more effective > way. > > 2. Possible soultion > > Creating a datatype which stores direct reference > to the record (i.e., physical location of the tuple) is only a part of the > solution. The tid type does exaclty what is needed. > > When a record that is referenced is updated its physical location can be > changed, so the references to it should be updated. To make this > possible, the referenced record should remember all the references to > itself. Thus, we consider the direct tuple references as bidirectional > links, or "bidirectional hard joins". > > These "hard joins" are in some sense similar to hard links in a > filesystem (in this analogy, classic joins are like symbolic links). > > Philosophically, this means a convergence between indexes and tables: a > table behaives like an index for an other table. > > Obviously, this is is a nonrelational feature, and it requires some > special SQL syntax. Below we provide some examples for clarification of > the use of the proposed feature. Or we can just use tid's and ordinary joins to make it a relational feature. IIRC this has been discussed on this list a few months ago. I'm not sure if bi-directional tid usage was discussed, but I can't see how to use them efficiently in a non-overwrite storage manager. ... > 4. Performance > > When direct joins are used in select statements, they can strongly > increase performance. > > Let us examine the query plan of the request ("Find all Bob's > children") from the example above in the present day postgres. > create table man (id SERIAL,name text); > create table child (id SERIAL,name text, parent_id int4 references man(id)); > .. populate the tables ... and create indexes... > explain select child.name from child, man > where child.parent_id = man.id > and man.name = 'Bob Scott'; > > Nested Loop > -> Index Scan using man_n on man > -> Index Scan using child_par on child > > > > In a hypotetical postgres with hard joins it could be: > > Nested Loop > -> Index Scan using man_n on man > -> Direct retrieval on child > > I.e., the for each retrieved "man" record we retrieve the "child" records > directly using hard join. The real overhead for this operation should be > neglible in comparison with index scan. OTOH, if index is in memory and the retrieved tuple is not then the _speed_difference_ could be neglible. > Using the hard joins require some additional overhead in updates. In fact, > after updating the record which takes part in such join, the references > to this record in the other records should be also updated. And this should be in a non-overwriting way. If we just do a standard UPDATE, causing a new heap record to be added this will result in a circle as then the original records references are not valid anymore and so also need to be updated and so on ... > This operation > is not essentially new for postgres as similar things are done with > indexes when an indexed record is updated. Hence, the overhead for updates > is not greater than the overhead for updating indexes. AFAIK indexes are not "updated" but a new index entry is added as the old tuple may be still visible to some other transaction. > 5. Implementation and conclusion > > Effective implementing of hard joins requires hard changes to postgres, > most serious of them probably in the executor, where a new method "fetch > record by reference" should be added in addition to "index scan" and "seq > scan". Also the optimizer should be taught to deal with this. > > The update support is not so hard as it is similar to the updating of > indexes. > > Though the implementation of such hard joins is really a complicated task, > the performance it brings should be tremendous, so we consider discussing > this important. Depending on usage the performance degradation can also be tremendous, as a simple update can trigger an avalance of referencing tid updates ... -------------- Hannu
Oleg Bartunov <oleg@sai.msu.su> writes: > Could you, please, comment the proposal. Okay: "ugly and unimplementable". Where are you going to put these back-references that the description glosses over so quickly? They can't be in the row itself; that doesn't scale to large numbers of references to the same row. I think you'd end up building an external datastructure that would in the final analysis offer no better performance than standard indexes. I'd also want to see an analysis of how this interacts with MVCC before we could consider whether it makes any sense in Postgres. In particular, which version of a row does the reference point at, and how will concurrent updates (possibly aborted) be handled? regards, tom lane
While I like the optimisation, the SQL syntax seems pretty horrible. Could it not be done without changing the syntax at all, except to change slightly how one defines a column? Given something like CREATE TABLE item_name ( item_id INT PRIMARY KEY, item_name VARCHAR(255) )CREATE TABLE item_set ( item_set_id INT PRIMARY KEY, item_id INT REFERENCES item_name (item_id) ON UPDATE CASCADE ON DELETE CASCADE ) it seems to me that it would be possible for the database to transparently implement this using the optimisation described. Given that, maybe one could just add another keyword to the REFERENCES statement that would actually do the reference with a "pointer"? cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're alllight. --XTC