Re: Some ideas about Vacuum - Mailing list pgsql-hackers

From Gokulakannan Somasundaram
Subject Re: Some ideas about Vacuum
Msg-id 9362e74e0801160549k226ae2b8u57f449c2efc68800@mail.gmail.com
In response to Re: Some ideas about Vacuum  ("Heikki Linnakangas" <heikki@enterprisedb.com>)
List pgsql-hackers

> I haven't been paying close attention to this thread, but there is a
> couple general issues with using the WAL for this kind of things. First
> of all, one extremely cool feature of PostgreSQL is that transaction
> size is not limited by WAL space, unlike on many other DBMSs. I think
> many of the proposed ideas of reading WAL would require us to keep all
> WAL available back to the beginning of the oldest running transaction.

Initially I thought this might be required. But the current idea is that Vacuum maintains one DSM per relation and updates it each time the WAL segment is switched. So if WAL logging is currently happening in segment 2, the first segment will be scanned to update the DSM.
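As a rough illustration of the segment-switch idea, here is a toy sketch in Python. It is not PostgreSQL code: the `DsmUpdater` class, the `read_segment` callable, and the `(relation, block)` record format are all hypothetical names made up for this example.

```python
from collections import defaultdict


class DsmUpdater:
    """Toy model: only WAL segments that are already complete get scanned."""

    def __init__(self):
        # relation name -> set of block numbers recorded in that relation's DSM
        self.dsm = defaultdict(set)
        # lowest segment number that has not been scanned yet
        self.next_segment = 1

    def on_segment_switch(self, current_segment, read_segment):
        """Scan every finished segment below current_segment.

        read_segment(n) is a hypothetical callable that returns the
        (relation, block) records found in segment n.
        """
        scanned = []
        while self.next_segment < current_segment:
            for relation, block in read_segment(self.next_segment):
                self.dsm[relation].add(block)
            scanned.append(self.next_segment)
            self.next_segment += 1
        return scanned
```

With logging at segment 2, a call to `on_segment_switch(2, ...)` scans only segment 1, which matches the example above. The segment still being written is never read.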

> Another issue is that reading WAL is inherently not very scalable.
> There's only one WAL for the whole cluster, and it needs to be read
> sequentially, so it can easily become a bottleneck on large systems.

Let me try to understand what would become a problem here. We are going to have only one process, which would open the WAL (one segment at a time) and update the DSMs. The limitation would be that it must finish reading a segment before that WAL segment is recycled. What else do you think would be a problem?
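The limitation mentioned above can be stated as a simple check. This is a toy sketch in Python under a simplifying assumption, namely that only the most recent `segments_retained` completed segments are still on disk. The function name and parameters are hypothetical and do not correspond to PostgreSQL settings.

```python
def reader_is_safe(current_segment, next_unscanned, segments_retained):
    """Return True if the oldest unscanned segment is still available.

    Toy assumption: segments older than
    current_segment - segments_retained have already been recycled.
    """
    oldest_available = max(1, current_segment - segments_retained)
    return next_unscanned >= oldest_available
```

For example, `reader_is_safe(5, 4, 2)` holds because segment 4 is still retained, while `reader_is_safe(10, 4, 2)` does not: the reader has fallen behind and segment 4 has already been reused.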

Thanks,
Gokul.
