Re: New form of index "persistent reference" - Mailing list pgsql-hackers
From | Bort, Paul |
---|---|
Subject | Re: New form of index "persistent reference" |
Date | |
Msg-id | 735D404BD9E7EB44B9CDFC27FC88809B0582D975@mail2.tmwsystems.com Whole thread Raw |
In response to | New form of index "persistent reference" (pgsql@mohawksoft.com) |
Responses |
Re: New form of index "persistent reference"
|
List | pgsql-hackers |
<p><font size="2">If that ID is the only thing you use to access that data, why not just store it in a flat file with fixed-lengthrecords? seek() (or your language's equivalent) is usually fast. </font><p><font size="2">If you need to drivethat from within PostgreSQL, you would need an untrusted language to read the file, but you could also generate it froma table using a trigger. </font><p><font size="2">Or maybe use a serial column, an index on that column, and clusterthe table on that index. It's more than one lookup, but not much with a Btree index. (Not sure if this is better thanjust using a serial and an index. <a href="http://www.postgresql.org/docs/8.0/interactive/sql-cluster.html" target="_blank">http://www.postgresql.org/docs/8.0/interactive/sql-cluster.html</a>says it isn't, if I read it correctly.)</font><p><fontsize="2">Then anytime there is a batch of updates to the table, re-cluster it.</font><p><font size="2">>-----Original Message-----</font><br /><font size="2">> From: pgsql@mohawksoft.com [<a href="mailto:pgsql@mohawksoft.com">mailto:pgsql@mohawksoft.com</a>]</font><br/><font size="2">> Sent: Thursday, February10, 2005 11:22 AM</font><br /><font size="2">> To: pgsql-hackers@postgresql.org</font><br /><font size="2">>Subject: [HACKERS] New form of index "persistent reference"</font><br /><font size="2">> </font><br /><fontsize="2">> </font><br /><font size="2">> For about 5 years now, I have been using a text search engine </font><br/><font size="2">> that I wrote</font><br /><font size="2">> and maintain.</font><br /><font size="2">></font><br /><font size="2">> In the beginning, I hacked up function mechanisms to return </font><br /><fontsize="2">> multiple value</font><br /><font size="2">> sets and columns. Then PostgreSQL aded "setof" and itis was </font><br /><font size="2">> cool. Then it</font><br /><font size="2">> was able to return a set of rows,which was even better.</font><br /><font size="2">> </font><br /><font size="2">> Lately, I have been thinkingthat a cool form of index would </font><br /><font size="2">> be some sort</font><br /><font size="2">> of"persistent reference" index. Like the old ISAM days of </font><br /><font size="2">> yore, a fixed</font><br /><fontsize="2">> number could point you right to the row that you want. I'm </font><br /><font size="2">> not sureif the</font><br /><font size="2">> "persistent reference" is a specific auto numbering column type or</font><br /><fontsize="2">> separate index structure or both.</font><br /><font size="2">> </font><br /><font size="2">> Iasked the question how do you get a record without going through an</font><br /><font size="2">> index, the answer wasCTID, which unfortunately changes when </font><br /><font size="2">> the row is</font><br /><font size="2">> updated.</font><br/><font size="2">> </font><br /><font size="2">> Now, what I want to brainstorm is some sort of "persistentreference"</font><br /><font size="2">> where the value is not algorithmically stored, maybe just an </font><br/><font size="2">> offset into a</font><br /><font size="2">> table. The number of operations should be about1 per lookup.</font><br /><font size="2">> </font><br /><font size="2">> Imagine a dynamically growing array thathas one slot per </font><br /><font size="2">> row. Every row</font><br /><font size="2">> is considered unique.Rows which are updated, their CTID is </font><br /><font size="2">> updated in the</font><br /><font size="2">>reference. (with vacuum?)</font><br /><font size="2">> </font><br /><font size="2">> Imagine somethinglike this:</font><br /><font size="2">> </font><br /><font size="2">> create table foobar(id reference, namevarchar, value varchar);</font><br /><font size="2">> </font><br /><font size="2">> select * from foobar whereid = 100;</font><br /><font size="2">> </font><br /><font size="2">> The reference type has an implicit indexthat is basically a </font><br /><font size="2">> lookup table.</font><br /><font size="2">> On unique referenceswhere the reference value is fairly </font><br /><font size="2">> arbitrary, this</font><br /><font size="2">>would be a HUGE gain for direct lookups. There is no need for </font><br /><font size="2">> the NlogN of</font><br/><font size="2">> a tree.</font><br /><font size="2">> </font><br /><font size="2">> On the surfacelevel, this would be a huge win for websites that use</font><br /><font size="2">> semi-fixed tables of data.</font><br/><font size="2">> </font><br /><font size="2">> </font><br /><font size="2">> </font><br /><fontsize="2">> ---------------------------(end of </font><br /><font size="2">> broadcast)---------------------------</font><br/><font size="2">> TIP 1: subscribe and unsubscribe commands go to </font><br/><font size="2">> majordomo@postgresql.org</font><br /><font size="2">> </font>
pgsql-hackers by date: