Thread: Re: [HACKERS] Fwd: Joins and links

Re: [HACKERS] Fwd: Joins and links

From
Bob Devine
Date:
Leon <leon@udmnet.ru> wrote:
> Earlier I proposed that links should be of a type similar to tid,
> so inserts should be fed with values of tid. But this requires an
> intermediate step, so there could be a function which takes a primary
> key and returns a tid, or, as you say, a function
> last_touched('other_table_name') - this seems the best choice.

Beware of adding special-purpose hard links as a way to
skip the run-time value comparisons.  A link looks attractive,
but it really only works for one-to-one relationships
(any multi-way relationship would require a list of links
to follow), and a link has all of the overhead that a
foreign key requires.

As someone who has developed several commercial dbms systems,
I would discourage doing a special "link" type.  There are
other ways to gain performance -- de-normalize your tables
if you are doing mainly reads; carefully check your storage
layout; and, of course, buy more RAM ;-)

--
Bob Devine  devine@cs.utah.edu


Re: [HACKERS] Fwd: Joins and links

From
Bruce Momjian
Date:
> Leon <leon@udmnet.ru> wrote:
> > Earlier I proposed that links should be of type similar to tid,
> > so inserts should be fed with values of tid. But this requires
> > intermediate step, so there can be a function which takes primary
> > key and returns tid, or as you say a function 
> > last_touched('other_table_name') - this seems the best choice.
> 
> Beware of adding special purpose hard-links as a way to
> skip the run-time value comparisons.  A link looks attractive
> but it really only works for one-to-one relationships
> (any multi-way relationships would require a list of links
> to follow) and a link has all of the overhead that a
> foreign key requires.
> 
> As someone who has developed several commercial dbms systems,
> I would discourage doing a special "link" type.  There are
> other ways to gain performance -- de-normalize your tables
> if you are doing mainly reads; carefully check your storage
> layout; and, of course, buy more RAM ;-)

Good to see you around Bob.  This guy does know what he is talking
about.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Bob Devine wrote:

> Beware of adding special-purpose hard links as a way to
> skip the run-time value comparisons.  A link looks attractive,
> but it really only works for one-to-one relationships
> (any multi-way relationship would require a list of links
> to follow) 

Not exactly. If you have a fixed set of links in a tuple, you
don't have to follow a list of them.

> and a link has all of the overhead that a
> foreign key requires.
> 

We looked at the matter carefully and found no overhead like
a foreign key's. Maybe you should read the thread carefully
once again.

> As someone who has developed several commercial dbms systems,
> I would discourage doing a special "link" type.  There are
> other ways to gain performance -- de-normalize your tables
> if you are doing mainly reads;

If I denormalize my tables, they will grow some five to ten 
times in size.

But simply think about what you are proposing: you are proposing
exactly to break the RDBMS "alphabet" to gain performance! This
means that even SQL warriors see the RDBMS ideology as improper
and corrupt, because it hinders performance.

You are contradicting yourself!

> carefully check your storage
> layout; and, of course, buy more RAM ;-)

And what will I do with performance loss from bloated tables?

-- 
Leon.




Re: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Bruce Momjian wrote:

> >
> > As someone who has developed several commercial dbms systems,
> > I would discourage doing a special "link" type.  There are
> > other ways to gain performance -- de-normalize your tables
> > if you are doing mainly reads; carefully check your storage
> > layout; and, of course, buy more RAM ;-)
> 
> Good to see you around Bob.  This guy does know what he is talking
> about.

Believe me, I know what I am saying. Once I spoke exactly as you do,
but having seen an implementation of a network DBMS, I suddenly
realized that SQL's days are numbered. The sooner you understand that,
the better.

-- 
Leon.





Re: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Bob Devine wrote:

> As someone who has developed several commercial dbms systems,
> I would discourage doing a special "link" type.

Of course you tried to implement links and failed, didn't you?
In that case I, and maybe others, want to hear what went wrong,
in order to benefit from your mistakes and lessons.

-- 
Leon.



Re: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Bruce Momjian wrote:

> > As someone who has developed several commercial dbms systems,
> > I would discourage doing a special "link" type.  There are
> > other ways to gain performance -- de-normalize your tables
> > if you are doing mainly reads; carefully check your storage
> > layout; and, of course, buy more RAM ;-)
> 
> Good to see you around Bob.  This guy does know what he is talking
> about.
> 

After thinking a bit, it became clear to me that we are flaming 
senselessly here. So can anyone do a fast hack to test links
for speed? Especially with three or more tables being joined.

-- 
Leon.




Re: [HACKERS] Fwd: Joins and links

From
Bob Devine
Date:
Leon wrote:
> If I denormalize my tables, they will grow some five to ten
> times in size.
> 
> But simply think what you are proposing: you are proposing
> exactly to break RDBMS "alphabet" to gain performance! This
> means that even SQL warriors see RDBMS's ideology as not
> proper and as corrupt, because it hinders performance.

and he wrote:
> After thinking a bit, it became clear to me that we are flaming 
> senselessly here. So can anyone do a fast hack to test links
> for speed? Especially with three or more tables being joined.

and in another message:
> Of course you tried to implement links and failed, didn't you?
> In that case I, and maybe others, want to hear what went wrong,
> in order to benefit from your mistakes and lessons.

It's a good idea to test it out.  My guess is that a hard link
between tables would speed up the join by a small amount.

The bigger drawbacks are:
1) the application design is now encoded in the database structure.
Using a link forces your _one_ application's needs onto all other
users of that table.  Each affected table would be bloated with
at least one more column.  All updates now affect multiple tables,
leading to more locking, paging, and synchronization overhead. Etc.

2) adding performance tweaks for one version condemns you to always
be aware of them in future versions.  I know of many cases where
people now hate the idea of a database performance "improvement"
that has prevented them from modifying the database schema.
One person's company is still using a database that everyone hates
because one critical application prevents them from changing it.
Indexes are about the only physical-level hack that has
survived the test of time.  An index is not part of the relational
model, but indexes are universally implemented because they yield
a huge payback.

3) Be aware of hardware improvements.  System performance is
still doubling every 18 months.  If a software hack can't match
that rate, it is probably not worth doing.


In my experience, old-style network and hierarchical databases
are still faster than relational systems, just as OO DBMSs
can be faster.  However, the non-relational databases gain their
speed by optimizing for a single application instead of taking a
more general-purpose approach.  Nearly every company that uses
databases realizes that flexibility is more important than a bit
more speed, unless that business has already maxed out its
computers' performance and is desperate for that extra bit.

It is my many years of watching databases in use that suggest
that links are not worth the overhead.  My gut feeling is that
links would speed up a simple join by only 10%, and there are
many other ways to speed up joins.

--
Bob Devine  devine@cs.utah.edu


Re: [HACKERS] Fwd: Joins and links

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
> 
> > > As someone who has developed several commercial dbms systems,
> > > I would discourage doing a special "link" type.  There are
> > > other ways to gain performance -- de-normalize your tables
> > > if you are doing mainly reads; carefully check your storage
> > > layout; and, of course, buy more RAM ;-)
> > 
> > Good to see you around Bob.  This guy does know what he is talking
> > about.

No.  I wasn't flaming, just confirming that he has lots of experience.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Bob Devine wrote:

> The bigger drawbacks are:
> 1) the application design is now encoded in the database structure.

This is true.

> Using a link forces your _one_ application's needs onto all other
> users of that table.  Each affected table would be bloated with
> at least one more column.

In fact, the link is intended to replace the foreign key in a given
table, not to coexist with it. Given that it eliminates the need
for an index, there is even a small space gain.

>  All updates now affect multiple tables
> leading to more locking, paging, and synchronization overhead. Etc.

Oh, no :)  After a short discussion it became clear that there
must be no link rewrite in the referencing table during an update.
So the update goes as usual, involving only one table. Instead we
have a chain of referenced tuples left after the update; VACUUM
eliminates these.
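The update scheme just described can be sketched as a toy model (a hypothetical illustration, not PostgreSQL code): the old tuple stays in place with a forward pointer to its replacement, so incoming links stay valid, only one table is touched, and VACUUM later short-circuits the chain.

```python
# Toy model of the no-rewrite update scheme (hypothetical sketch).

class Tup:
    def __init__(self, data):
        self.data = data
        self.next = None          # forward pointer, set when superseded

def update(old, new_data):
    """Create the new version; referencing tables are not rewritten."""
    new = Tup(new_data)
    old.next = new
    return new

def deref(link):
    """Follow a link through any chain of superseded versions."""
    t = link
    while t.next is not None:
        t = t.next
    return t

v1 = Tup({"qty": 1})
link = v1                          # a link held by some referencing tuple
v2 = update(v1, {"qty": 2})
v3 = update(v2, {"qty": 3})

live = deref(link)                 # still reaches the current version
```

A VACUUM-style pass would simply rewrite each stored link to `deref(link)`, collapsing the chain back to a single hop.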

> 
> 2) adding performance tweaks for one version condemns you to always
> be aware of them in future versions.

Absolutely right. Since we have started talking about general
matters, let me make my position clear.

Every tool is suited to its purpose. No one walks from city to
city when a car will do, and no one takes a car to get to a
neighbor's home for evening tea :)  So: there are tasks of
different kinds. Some are flexible and often require redesigning
relationships. But there are others which are well known and well
explored, and have a well-known structure. Accounting is one of
them; there are plenty of others, without doubt. What is proposed
is a tool to handle tasks of the second sort effectively, since a
general RDBMS is a tool for the other, flexible tasks. This is a
matter of design, and it is the designer's job to choose the right
tool. If the designer makes the wrong choice, that is his problem
and his kicked ass. You should give designers as many tools as
possible and let them choose. They will love you for that :)


> 3) Be aware of hardware improvements.  System performance is
> still doubling every 18 months.  If a software hack can't match
> that rate, it is probably not worth doing.

Oh, that argument again :)  I'll tell you: sooner or later this
development will stop. There are purely physical obstacles that
prevent manufacturing silicon chips with frequencies much higher
than 10 gigahertz.

> It is my many years of watching databases in use that suggest
> that links are not worth the overhead.  My gut feeling is that
> links would speed up a simple join by only 10% and there are
> many other ways to speed up joins.

Let's count. We have two tables, joined by a link. What is the
cost of a lookup? First there is an index scan, which takes between
2 and 5 iterations, and then the link lookup, which is 1 iteration:
4 iterations on average. Without the link there are 6 iterations.
More than 10% already! And we still haven't considered joining
multiple tables, or big tables. So the gain will be big anyway.
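The arithmetic above can be restated as a toy cost model (the 2-5 and 1 iteration figures are Leon's assumptions, not measurements); it also shows how the gap widens with each additional joined table:

```python
# Toy restatement of the iteration counting above (hypothetical numbers).

INDEX_ITERS = 3   # assumed average B-tree lookup cost (2-5 in the text)
LINK_ITERS = 1    # assumed cost of a direct tid-style link dereference

def join_cost(n_tables, with_links):
    """Iterations to fetch one row chain across n_tables joined tables.
    The first table is reached via an index scan either way; each
    further hop is a link dereference or another index lookup."""
    hops = n_tables - 1
    per_hop = LINK_ITERS if with_links else INDEX_ITERS
    return INDEX_ITERS + hops * per_hop

two_linked = join_cost(2, True)     # 3 + 1 = 4, as in the text
two_plain = join_cost(2, False)     # 3 + 3 = 6
four_linked = join_cost(4, True)    # 3 + 3*1 = 6
four_plain = join_cost(4, False)    # 3 + 3*3 = 12
```

Under these assumptions a two-table join saves 2 iterations, while a four-table join saves 6, so the relative gain grows with the number of tables joined.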

That is not even counting the optimizer (do I sound like a broken
record? :)  To be sincere, the current Postgres optimizer sucks
heavily and in most cases can't figure out the fastest way.
Implementing links is a quick and cheap way to get a performance
gain on a wide range of tasks. I am obliged to repeat this again
and again, because every day a new developer appears who hasn't
heard it yet :)

-- 
Leon.



Re: [HACKERS] Fwd: Joins and links

From
Tom Ivar Helbekkmo
Date:
Leon <leon@udmnet.ru> writes:

> > 3) Be aware of hardware improvements.  System performance is
> > still doubling every 18 months.  If a software hack can't match
> > that rate, it is probably not worth doing.
> 
> Oh, that argument again :)  I'll tell you - sooner or later
> this development will stop. There are purely physical obstacles
> that prevent manufacturing of silicon chips with frequencies much
> higher than 10 gigahertz.

Furthermore, the continuous availability of ever faster hardware at
low prices will slow down very soon, now that the MS Windows users
finally don't need to upgrade to twice as fast computers every 18
months just to be able to run the latest version of MS bloatware, and
will spend their money on peripherals and fast net access instead.

...but as for "purely physical obstacles", I don't buy it.  We will
always find a way to make what we need.  Count on it.

-tih
-- 
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"


Re: [HACKERS] Fwd: Joins and links

From
Tom Ivar Helbekkmo
Date:
Bob Devine <devine@cs.utah.edu> writes:

> 3) Be aware of hardware improvements.  System performance is
> still doubling every 18 months.  If a software hack can't match
> that rate, it is probably not worth doing.

I like Kernighan's and Pike's argument, presented in their recent
book, The Practice of Programming, that if a software improvement is
expected to save more accumulated user time than the programmer time
spent making it, then it should be considered worthwhile.

Great book, by the way.  Highly recommended.

-tih
-- 
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"


Re: [HACKERS] Fwd: Joins and links

From
Hannu Krosing
Date:
Leon wrote:
> 
> Bob Devine wrote:
> 
> > It is my many years of watching databases in use that suggest
> > that links are not worth the overhead.  My gut feeling is that
> > links would speed up a simple join by only 10% and there are
> > many other ways to speed up joins.
> 
> Let's count. We have two tables, joined by link. What is the
> cost of lookup? First there is an index scan, which is between
> 2 and 5 iterations, and link lookup which is 1 iteration. Average
> is 4 iterations.

This is true for the case where you want to look up only one row.

The difference will quickly degrade as more rows are fetched in one 
query and cache misses and disk head movement start rattling your 
disks. The analogy is a man who needs 10 different items from a 
supermarket and makes 10 full round trips from home to buy them.

> And if we don't have link, there is 6 iterations.
> More than 10% already! We still didn't consider joining multiple
> tables and big tables.

I think that the two-tables-one-row lookup will gain the most, 
probably even more than 10%.

> So the gain will be big anyway.
> 
> That is not to consider the optimizer (do I sound like a broken
> record? :)  To be sincere, current Postgres optimizer sucks heavily
> and in most cases can't figure out the fastest way.

Adding links does nothing to improve the optimizer; it's still free 
to choose sucky plans. It is possible that links are faster if used 
in the right way, since they cut out the index lookup, but I suspect 
that hard-coding link-is-always-faster into the optimizer would also 
produce a lot of very bad plans. 

Link-is-always-faster is probably true only for all-memory
databases, and even there not always: for example, if it happened
to produce a worse initial ordering for sort/group by than some
other strategy, a complex query could still run slower (the
difference will be small either way).

> Implementing links is a quick and cheap way to get a performance 
> gain on a wide range of tasks.

Fixing the optimizer would get a performance gain on a far wider 
range of tasks, and is still needed for links.

> I am obliged to repeat this again and again,
> because every day there appears a new developer who didn't hear
> that yet :)

Unfortunately there are far fewer _developers_ than letter-writers,
and it is sometimes quite hard to get them even to commit good and
useful patches that are ready.

So I guess that if you want links in the foreseeable future, your
best bet would be to start coding, and to coordinate with whoever
starts to fix/rewrite the optimizer (probably Vadim)

(BTW, in PostgreSQL I still consider myself a letter-writer and not 
a developer, as I have committed no code for the backend)

-------------
Hannu


Re: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Hannu Krosing wrote:

> The difference will quickly degrade as more rows are fetched in one
> query and cache misses and disk head movement start rattling your
> disks. The analogy being a man who needs 10 different items from a
> supermarket and takes 10 full round trips from home to buy them.

Frankly, I didn't even consider fetching the database from disk.
That slows queries immensely, and I wonder whether anyone exists
who doesn't keep their entire DB in RAM.

> I think that the two-tables-one-row lookup will gain the most,
> probably even more than 10%

I think the gain will rise with the number of tables, because the
more tables, the more index lookups are saved.

> Adding links does nothing to improve the optimizer, its still free
> to choose sucky plans. It is possible that links are faster if used
> in the right way, as they cut out the index lookup, but I suspect that
> hard-coding link-is-always-faster into the optimiser would also produce
> a lot of very bad plans.

Methinks that hard-wiring link-is-always-faster into the optimizer
will still help it very much, because there are few situations
where it is not true.

> Fixing the optimizer would get a performance gain on a far wider
> range of tasks, and is still needed for links.

But a general fix of the optimizer means a total rewrite of it,
whereas the link fix is almost a fast hack.

> So I quess thet if you want links in foreseeable future, your best bet
> would be to start coding, and to coordinate with whoever starts to
> fix/rewrite
> the optimizer (probably Vadim)
>

Unfortunately I already have a project to work on. There is too 
little of me for two projects.

-- 
Leon.




Re: [HACKERS] Fwd: Joins and links

From
Hannu Krosing
Date:
Leon wrote:
> 
> Hannu Krosing wrote:
> 
> > The difference will quickly degrade as more rows are fetched in one
> > query and cache misses and disk head movement start rattling your
> > disks. The analogy being a man who needs 10 different items from a
> > supermarket and takes 10 full round trips from home to buy them.
> 
> Frankly, I didn't even consider fetching database from disk. This
> slows queries immensely and I wonder if there exist someone who
> doesn't keep their entire DB in RAM.

Well, I personally don't even know how I could keep my entire
PostgreSQL DB in RAM :)

It would be interesting to know what percentage of people use 
PostgreSQL for databases that are small enough to fit in RAM - 
surely not the ones who need splitting of tables >2GB.

And I think that setting up PostgreSQL for maximum RAM usage would 
make a nice topic for "Optimizing PostgreSQL". 

When my backends are mostly idle they usually use about 3-4MB of
memory (hardly enough for any database :).

It is quite possible that some bigger tables end up in the disk
cache, but you can expect to find all your data in that cache only
if you run many queries on the same tables in a row, and the
machine is otherwise idle.

> I think the gain will rise with the number of tables, because the
> more tables, the more index lookups are saved.

My point is that sometimes even a sequential scan is faster than an
index lookup, and not only due to the overhead of using the index,
but due to the better disk performance of sequential reads vs.
random reads.

For in-memory databases this of course does not count.

Still I'm quite sure that the main effort in PostgreSQL development
has so far gone into optimising queries where most of the data is
fetched from disk.

> > Fixing the optimizer would get a performance gain on a far wider
> > range of tasks, and is still needed for links.
> 
> But general fixing of optimizer is total rewritement of it, whereas
> link fix is almost a fast hack.

I'm not too sure about that. It can certainly be done without a
_total_ rewrite, but getting all the new node types and access
methods into the parser/planner/executor may not be trivial.

One idea would be a cross-table OID index for anything in memory.
Then, assuming that everything is in memory, using OIDs as links
would be only trivially slower, if at all (2-10 memory accesses and
comparisons), than a "straight" link lookup, which could itself end
up chasing linear chains of forward-id-links on frequently updated
DBs. On infrequently updated DBs you could just use triggers and/or
cron jobs to keep your reports updated; I guess that this is what
most commercial OLAP systems do.
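The cross-table OID index idea could be sketched like this (a hypothetical design sketch, not actual PostgreSQL code): a single in-memory hash from OID to tuple, so dereferencing an OID "link" costs a handful of memory accesses regardless of which table the target lives in.

```python
# Sketch of a cross-table OID index (hypothetical design): one hash table
# maps every OID to its owning table and row, so an OID link resolves in
# O(1) with no per-table index scan.

class OidIndex:
    def __init__(self):
        self._by_oid = {}                 # oid -> (table name, row dict)

    def insert(self, oid, table, row):
        self._by_oid[oid] = (table, row)

    def deref(self, oid):
        """Resolve an OID link: a few memory accesses, no index scan."""
        return self._by_oid[oid]

idx = OidIndex()
idx.insert(101, "customer", {"name": "Leon"})
idx.insert(202, "orders", {"customer_oid": 101})

_, order = idx.deref(202)
_, customer = idx.deref(order["customer_oid"])   # one hop per link
```

The table names and OIDs here are invented for illustration; the point is only that the hash lookup replaces the index scan that an ordinary join key would need.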

Actually I lived my first half-year of using PostgreSQL under the
delusion that lookup by OID would be somehow special (fast),
probably due to my misunderstanding of the (ever diminishing) O in
ORDBMS :)
There have been some attempts to get object-orientedness better
supported by PGSQL (possibly even some contrib functions), but
nobody seems to have needed it badly enough to
1) implement it, and
2) shout long enough to get it included in the standard distribution.
Most (all?) of the developers seem to be die-hard RDBMS guys, and
thanks to that we now have a solid and reasonably fast relational
DBMS with some OO rudiments.

So I guess that unless you do (at least part of) links, no-one else
will ;(

> Unfortunately I already have a project to work on. There is too
> little of me for two projects.

Darn! I almost hoped we would get one more PostgreSQL hacker, as
I'm sure that after familiarising oneself with the code enough to
implement links one would be quite capable of helping with most of
PostgreSQL development <grin>

----------------------
Hannu


Re[2]: [HACKERS] Fwd: Joins and links

From
Leon
Date:
Hello Hannu,

Friday, July 09, 1999 you wrote:


H> Still I'm quite sure that the main effort in PostgreSQL development
H> has so far gone into optimising queries where most of the data is
H> fetched from disk.

Oh, I see. This is appropriate for some non-time-critical and dumb
applications, such as web DBs, but it is out of the way of fast
server tasks. Maybe Postgres was designed with such a plan in mind,
to serve big DBs from disk? That is not good news for me either.
Almost everyone has suggested that I use more RAM to speed up
queries, and now it turns out not to be in Postgres's mainstream.
Maybe there is something wrong with this ideology, since RAM gets
bigger and cheaper every day?

H> forward-id-links on frequently updated DBs. On infrequently updated
H> DBs you could just use triggers and/or cron jobs to keep your
H> reports updated; I guess that this is what most commercial OLAP
H> systems do.

It seems that a trigger will be the last resort.

Best regards, Leon




Re: [HACKERS] Fwd: Joins and links

From
Bruce Momjian
Date:
> Unfortunately there are far fewer _developers_ than letter-writers,
> and it is sometimes quite hard to get them even to commit good and
> useful patches that are ready.
> 
> So I guess that if you want links in the foreseeable future, your
> best bet would be to start coding, and to coordinate with whoever
> starts to fix/rewrite the optimizer (probably Vadim)

Are people complaining about the 6.5 optimizer, or the pre-6.5
optimizer?  6.5 has a much improved optimizer.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Fwd: Joins and links

From
Maarten Boekhold
Date:

Hannu Krosing wrote: 
> Leon wrote:

> > Frankly, I didn't even consider fetching database from disk. This
> > slows queries immensely and I wonder if there exist someone who
> > doesn't keep their entire DB in RAM.
> 
> Well, I personally dont even know, how I could keep my entire PostgreSQL
> DB in RAM :)

I thought about doing this once on a Linux box. What I was thinking of
was creating a large RAM disk and using it together with a physical
drive in a mirror setup. However, I was never able to create a large
enough RAM disk back then (must have been around Linux 2.0), and the
RAID mirror code also wasn't able to support such a mix of devices
(i.e. RAM disk + physical disk). The situation might have changed by now.

Maarten

-- 

Maarten Boekhold, boekhold@tibco.com
TIBCO Finance Technology Inc.
The Atrium
Strawinskylaan 3051
1077 ZX Amsterdam, The Netherlands
tel: +31 20 3012158, fax: +31 20 3012358
http://www.tibco.com



Re: [HACKERS] Fwd: Joins and links

From
Brian E Gallew
Date:
Then <boekhold@tibco.com> spoke up and said:
> Hannu Krosing wrote: 
> > Leon wrote:
> > > Frankly, I didn't even consider fetching database from disk. This
> > > slows queries immensely and I wonder if there exist someone who
> > > doesn't keep their entire DB in RAM.
> > 
> > Well, I personally dont even know, how I could keep my entire PostgreSQL
> > DB in RAM :)
> 
> I thought about doing this once on a Linux box. What I was thinking about was
> creating a large RAM disk, and use that disk together with a physical drive in
> a mirror setup. However, I was never able to create a large enough RAM disk back then
> (must have been like LInux 2.0), and also the RAID mirror code wasn't able to
> support such a mix of devices (i.e. RAM disk + physical disk). The situation might
> have changed by now.

Maarten, PostgreSQL keeps its data in the filesystem, rather than on
raw disks.  Due to the nature of *nix, all you need to do to keep your
entire DB in memory is have enough memory; the buffer cache will do
the rest for you.  Of course, you still need to start it up with -F
to avoid fsyncs.  This is also somewhat OS-dependent, as you may have
to do some tuning to allow full memory utilization in this manner.
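In practice that suggestion might look something like the sketch below (illustrative only: the data directory path and database name are hypothetical; -F is the fsync-disabling backend flag mentioned above, passed through via the postmaster's -o option).

```shell
# Start the postmaster with -o '-F' so backends skip fsync() and rely on
# the OS buffer cache (paths here are hypothetical examples).
postmaster -D /usr/local/pgsql/data -o '-F' &

# Warm the cache by reading the relation files once; subsequent queries
# then hit memory instead of disk, as long as nothing evicts the pages.
cat /usr/local/pgsql/data/base/mydb/* > /dev/null
```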


-- 
=====================================================================
| JAVA must have been developed in the wilds of West Virginia.      |
| After all, why else would it support only single inheritance??    |
=====================================================================
| Finger geek@cmu.edu for my public key.                            |
=====================================================================

Re: [HACKERS] Fwd: Joins and links

From
Maarten Boekhold
Date:
> > > Well, I personally don't even know how I could keep my entire PostgreSQL
> > > DB in RAM :)
> >
> > I thought about doing this once on a Linux box. What I was thinking about was
> > creating a large RAM disk, and use that disk together with a physical drive in
> > a mirror setup. However, I was never able to create a large enough RAM disk back then
[...] 
> Maarten, PostgreSQL keeps its data in the filesystem, rather than on
> raw disks.  Due to the nature of *nix, all you need to do to keep your
> entire DB in memory is have enough memory; the buffer cache will do
> the rest for you.  Of course, you still need to start it up with -F

I know, but there's no *guarantee* that the complete database is going
to be in RAM. That's what I was trying to solve; putting the thing on
a RAM disk would guarantee that it is.

Maarten

-- 

Maarten Boekhold, boekhold@tibco.com
TIBCO Finance Technology Inc.
The Atrium
Strawinskylaan 3051
1077 ZX Amsterdam, The Netherlands
tel: +31 20 3012158, fax: +31 20 3012358
http://www.tibco.com

