Re: Performance (was: The New Slashdot Setup (includes MySql server)) - Mailing list pgsql-hackers

From Michael A. Olson
Subject Re: Performance (was: The New Slashdot Setup (includes MySql server))
Date
Msg-id 200005191420.HAA86131@triplerock.olsons.net
In response to Re: Performance (was: The New Slashdot Setup (includes MySql server))  ("Matthias Urlichs" <smurf@noris.net>)
Responses Re: Performance (was: The New Slashdot Setup (includes MySql server))  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Performance (was: The New Slashdot Setup (includes MySql server))  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
At 02:04 PM 5/19/00 +0200, you wrote:

> Well, it's reasonable that you can't keep an index on the table which
> states what the indices are. ;-)
> 
> ... on the other hand, Apple's HFS file system stores all the information
> about the on-disk locations of their files in, you guessed it, a B-Tree
> which is saved on disk as an (invisible) file.
> Thus, the thing stores the information on where its sectors are located
> at, inside itself.
> To escape this catch-22 situation, the location of the first three
> extents (which is usually all it takes anyway) is stored elsewhere.
> 
> Possibly, something like this would work with postgres too.

This is one of several things we did at Illustra to make the backend
run faster.  I did the design and implementation, but it was a few
years ago, so the details are hazy.  Here's what I remember.

We had to solve three problems:

First, you had to be able to run initdb and bootstrap the system
without the index on pg_index in place.  As I recall, we had to
carefully order the creation of the first several tables to make
that work, but it wasn't rocket science.

Second, when the index on pg_index gets created, you need to update
it with tuples that describe it.  This is really just the same as
hard-coding the pg_attribute attribute entries into pg_attribute --
ugly, but not that bad.
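
A toy sketch of that second step, for illustration only. This is not the
Illustra or PostgreSQL code; the class, the method names, and the index
names (pg_class_name_idx, pg_index_name_idx) are all hypothetical. It only
shows the ordering: rows go into the catalog with no index present, the
index is then built from the rows already there, and finally the row
describing the new index is inserted through the normal path.

```python
# Toy model of a catalog of indexes that is itself indexed.
# All names are hypothetical; this is not PostgreSQL or Illustra code.

class Catalog:
    def __init__(self):
        self.pg_index_rows = []        # heap: one row per index in the system
        self.pg_index_by_name = None   # index on that heap; absent during bootstrap

    def add_index_row(self, index_name, table_name):
        """Record that index_name exists on table_name."""
        row = {"index": index_name, "table": table_name}
        self.pg_index_rows.append(row)
        # The index can only be maintained once it has been created.
        if self.pg_index_by_name is not None:
            self.pg_index_by_name[index_name] = row
        return row

    def create_index_on_pg_index(self):
        """Create the index on pg_index, then make it describe itself."""
        # Build the index from the rows that were inserted without it.
        self.pg_index_by_name = {r["index"]: r for r in self.pg_index_rows}
        # Add the tuple that describes the new index. Because the index
        # now exists, this insert also updates the index.
        self.add_index_row("pg_index_name_idx", "pg_index")


def bootstrap():
    cat = Catalog()
    # Early catalog entries are created before the index exists.
    cat.add_index_row("pg_class_name_idx", "pg_class")
    cat.create_index_on_pg_index()
    return cat
```

After bootstrap(), looking up "pg_index_name_idx" through the index finds
the row that describes the index itself.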

Third, we had to abstract a lot of the hard-coded table scans in
the bowels of the system to call a routine that checked for the
existence of an index on the system table, and used it.  In order
for the index on pg_index to get used, its reldesc had to be nailed
in the cache.  Getting it there at startup was more hard-coded
ugliness, but you only had to do it one time.
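
A minimal sketch of the kind of routine described in the third point,
under stated assumptions. SystemTable, build_index, and catalog_lookup are
hypothetical names, and a Python dict stands in for a real index. It does
not model the relation cache or nailing. The point is only that callers go
through one routine, which uses an index when a suitable one exists and
falls back to a sequential scan otherwise.

```python
# Toy model of "use the index on a system table if there is one".
# All names are hypothetical; this is not PostgreSQL or Illustra code.

class SystemTable:
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows          # list of dicts, one per tuple
        self.index_key = None     # attribute the index is built on, if any
        self.index = None         # dict: attribute value -> row

    def build_index(self, key):
        """Build a unique index on the given attribute."""
        self.index_key = key
        self.index = {row[key]: row for row in self.rows}


def catalog_lookup(table, key, value):
    """Return (row, method) for the first row where row[key] == value.

    method is "index" when an index on key was used and "seqscan" when
    the routine had to fall back to scanning the whole table.
    """
    if table.index is not None and table.index_key == key:
        return table.index.get(value), "index"
    for row in table.rows:
        if row.get(key) == value:
            return row, "seqscan"
    return None, "seqscan"


pg_class = SystemTable("pg_class", [
    {"relname": "pg_class", "relkind": "r"},
    {"relname": "pg_index", "relkind": "r"},
])

# Before the index exists, every lookup is a scan.
row_before, how_before = catalog_lookup(pg_class, "relname", "pg_index")

# Once the index is in place, the same call uses it.
pg_class.build_index("relname")
row_after, how_after = catalog_lookup(pg_class, "relname", "pg_index")
```

Callers do not change when an index is added later, which is what makes it
cheap to index more catalog tables afterward.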

The advantage is that you can then index a bunch more of the system
catalog tables, and on a bunch more attributes.  That produced some
surprising speedups.

This was simple enough that I'm certain the same technique would
work in the current engine.
                mike
