Re: crash-safe visibility map, take three - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: crash-safe visibility map, take three |
Date | |
Msg-id | AANLkTikNcbySP_HDS0ZoaEWmaA=JBRWhssstD7xTSmNc@mail.gmail.com |
In response to | Re: crash-safe visibility map, take three (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: crash-safe visibility map, take three |
List | pgsql-hackers |
On Thu, Dec 2, 2010 at 2:01 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> * We don't get an exclusive lock when dirtying a page with hint bits
>  - Why: we write while reading, and we want good concurrency.
>  - Why': because after a bulk load, we don't have any hint bits, and the
> only way to get them set without VACUUM is to write while reading. I've
> never been entirely sure why VACUUM isn't good enough in this case,
> aside from the fact that a user might not run VACUUM (and autovacuum
> might not either, if it was only a bulk load and no updates/deletes).
>
> * We don't WAL log setting hint bits (which dirties a page)
>  - Why: because after a bulk load, we don't want to write the data a 4th
> time
>
> Hypothetically, if we had a bulk loading strategy, these problems would
> go away, and we could follow the rules. Right? Is there a case other
> than bulk loading which demands that we break these rules?

I'm not really convinced that this problem is confined to bulk
loading. Every INSERT or UPDATE results in a new tuple that may need
hint bits set and eventually to be frozen. A bulk load is just a time
when you do lots of inserts all at once; it seems to me that a large
update would cause all the same problems, plus bloat. The triple I/O
problem exists for small transactions as well (and isn't desirable
there either); it's just less noticeable because the second and third
writes are, like the first one, small.

> And, if we had a bulk loading path, we could probably get away with
> writing the data only twice (today, we write it 3 times including the
> hint bits) or maybe once if WAL archiving is off.

It seems to me that a COPY command executed in a transaction with no
other open snapshots, writing to a table created or truncated within
the same transaction, should be able to write frozen tuples from the
get-go, regardless of anything else we do.

> So, is there a case other than bulk loading for which we need to break
> these rules? If not, perhaps we should consider bulk loading a different
> problem, and simplify the design of all of these other features (and
> allow new storage-touching features to come about, like CRCs, without
> exponentially increasing the complexity with each one).

I don't think we're exponentially increasing complexity - I think
we're incrementally improving our algorithms. If you want to propose
a bulk loading path, great. Propose away! But without something a bit
more concrete, I don't think it would be appropriate to hold off
making the visibility map crash-safe, on the off chance that our
design for so doing might complicate something else we want to do
later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company