Re: Contrib -- PostgreSQL shared variables - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Contrib -- PostgreSQL shared variables
Date
Msg-id Pine.OSF.4.60.0408282002160.230146@kosh.hut.fi
In response to Re: Contrib -- PostgreSQL shared variables  (pgsql@mohawksoft.com)
Responses Re: Contrib -- PostgreSQL shared variables
List pgsql-hackers
On Sat, 28 Aug 2004 pgsql@mohawksoft.com wrote:

>
>> I don't see how this is different from "CREATE TABLE shared_variables
>> (name
>> VARCHAR PRIMARY KEY, value VARCHAR)" and
>> inserting/updating/deleting/selecting from that. Perhaps these are
>> per-session shared variables? IN which case, what is the utility if
>> sharing
>> them across shared memory?
>>
>> - --
>> Jonathan Gardner
>
> Well, the issues you don't see is this:
>
> What if you have to update the variables [n] times a second?
>
> You have to vacuum very frequently. If you update a variable a hundred
> times a second, and vacuum only once every minute, the time it takes to
> update ranges from reading one row from the database to reading 5999 dead
> rows to get to the live one. Then you vacuum, then you are back to one row
> again.
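
The arithmetic in the quoted paragraph can be spelled out as a rough sketch.
It assumes a single row that is updated at a fixed rate, that every update
leaves exactly one dead row version behind, and that vacuum removes all of
them. The function names are made up for illustration and are not part of
PostgreSQL.

```python
def dead_rows_seen(update_number):
    """Dead row versions that the Nth update after a vacuum has to pass.

    Assumption: each earlier update since the vacuum left one dead version.
    """
    return update_number - 1


def worst_case_dead_rows(updates_per_second, vacuum_interval_seconds):
    """Dead versions seen by the last update before the next vacuum."""
    updates_per_interval = updates_per_second * vacuum_interval_seconds
    return dead_rows_seen(updates_per_interval)


# 100 updates a second, vacuum once a minute: 6000 updates per interval.
print(worst_case_dead_rows(100, 60))
```

The first update after a vacuum sees no dead rows, and the last one before
the next vacuum sees 5999, which is the range described in the quote.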

I think the right approach is to tackle that problem instead of working 
around it with a completely new variable mechanism.

I've been playing with the idea of a quick vacuum that runs through the 
shmem buffers. The idea is that since the pages are already in memory, 
the vacuum runs very quickly. Vacuuming the hot pages frequently 
enough should avoid the problem you describe. It also saves I/O in the 
long run since dirty pages are vacuumed before they are written to 
disk, eliminating the need to read in, vacuum and write the same pages 
again later.

The problem is of course that to vacuum the heap pages, you have to make 
sure that there are no references to the dead tuples from any indexes.

The trivial case is that the table has no indexes. But I believe that 
even if the table has ONE index, it's very probable that the corresponding 
index pages of the dead tuples are also in memory, since the tuple was 
probably accessed through the index.

As the number of indexes gets bigger, the chances of all corresponding 
index pages being in memory get smaller.

If the "quick" vacuum, or opportunistic vacuum as I call it, is clever 
enough to recognize that there is a dead tuple in memory, and all the 
index pages that reference it are in memory too, it could reliably vacuum 
just those tuples without scanning through the whole relation and without 
doing any extra I/O.
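
The check described above might look roughly like this. It is only a sketch
of the decision rule, not the code in the attached patch: DeadTuple,
removable_without_io and the string page identifiers are hypothetical names
invented for the example, and a real implementation would also have to deal
with locking and tuple visibility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeadTuple:
    """A dead heap tuple plus the index pages that still point at it."""
    heap_page: str
    index_pages: tuple = ()  # empty for a table with no indexes


def removable_without_io(dead_tuples, buffered_pages):
    """Return the dead tuples that can be vacuumed using only cached pages.

    A tuple qualifies when its heap page and every index page referencing
    it are already in the buffer cache (buffered_pages, a set of page ids).
    """
    removable = []
    for tup in dead_tuples:
        if tup.heap_page not in buffered_pages:
            continue  # heap page is not in memory; leave it to a full vacuum
        if all(page in buffered_pages for page in tup.index_pages):
            removable.append(tup)
    return removable
```

With no indexes the second test is trivially true, which is the trivial
case. Each additional index adds another page that has to be cached before
the tuple can be removed.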

I've written some code that implements the trivial case of no indexes. I'm 
hoping to extend it to handle the indexes too if I have time. Then we'll 
see if it's any good. I've attached a patch with my current ugly 
implementation if you want to give it a try.

> On top of that, all the WAL logging that has to take place for each
> "transaction."

How is that a bad thing? You don't want to give up ACID, do you?

- Heikki
