Re: proposal: multiple read-write masters in a cluster with wal-streaming synchronization - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: proposal: multiple read-write masters in a cluster with wal-streaming synchronization |
Date | |
Msg-id | 20140102191643.GA2542@awork2.anarazel.de |
In response to | Re: proposal: multiple read-write masters in a cluster with wal-streaming synchronization (Mark Dilger <markdilger@yahoo.com>) |
Responses |
Re: proposal: multiple read-write masters in a cluster with wal-streaming synchronization
|
List | pgsql-hackers |
On 2014-01-02 10:18:52 -0800, Mark Dilger wrote:
> I anticipated that my proposal would require partitioning the catalogs.
> For instance, autovacuum could only run on locally owned tables, and
> would need to store the analyze stats data in a catalog partition belonging
> to the local server, but that doesn't seem like a fundamental barrier to
> it working.

It would make every catalog lookup noticeably more expensive.

> The partitioned catalog tables would get replicated like
> everything else. The code that needs to open catalogs and look things
> up could open the specific catalog partition needed if it already knew the
> Oid of the table/index/whatever that it was interested in, as the catalog
> partition desired would have the same modulus as the Oid of the object
> being researched.

Far, far, far from every lookup is by oid. Most prominently, lookups by
the names of database objects are not. Those will have to scan every
catalog partition. Not fun.

> Your point about increasing the runtime of pg_upgrade is taken. I will
> need to think about that some more.

It's not about increasing the runtime, it's about simply breaking
it. pg_upgrade relies on binary compatibility of user relations' files,
and you're breaking that if you change the width of datatypes.

> Your claim that what I describe is not multi-master is at least partially
> correct, depending on how you think about the word "master". Certainly
> every server is the master of its own chunk.

Well, you're essentially just describing a sharded system - that's not
usually called multimaster.

> Your claim that BDR doesn't have to be much slower than what I am
> proposing is quite interesting, as if that is true I can ditch this idea and
> use BDR instead. It is hard to empirically test, though, as I don't have
> the alternate implementation on hand.

Well, I can tell you that for the changeset extraction stuff (which is
the basis for BDR) the biggest bottleneck so far seems to be the CRC
computation when reading the WAL - and that's something plain WAL apply
has to do as well. And it is optimizable.
When actually testing decoding & apply, for workloads fitting into
memory I had to try very hard to construct situations where apply was a
big bottleneck. It is easier for seek-bound workloads where the standby
is less powerful than the primary, since there are more random reads for
those due to full page writes removing the need for reads in many cases.

> I think the expectation that performance will be harmed if postgres
> uses 8 byte Oids is not quite correct.
>
> Several years ago I ported postgresql sources to use 64bit everything.
> Oids, varlena headers, variables tracking offsets, etc. It was a fair
> amount of work, but all the doom and gloom predictions that I have
> heard over the years about how 8-byte varlena headers would kill
> performance, 8-byte Oids would kill performance, etc, turned out to
> be quite inaccurate.

Well, it can increase the size of the database, turning a system where
the hot set fits into memory into one where it doesn't anymore. But
really, the performance concerns were more about the catalog lookups.

Fundamentally, I see nothing preventing such a scheme from being
implemented - but I think there's about zero chance of it ever getting
integrated; it's just far too invasive, with very high costs in scenarios
where it's not used, for not all that much gain. Not to speak of the
amount of engineering it would require to implement.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
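
The catalog-lookup objection in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: `PartitionedCatalog`, its methods, and the `probes` counter are hypothetical names invented here, and the partitions are plain in-memory dictionaries. It assumes rows are placed by `oid % n_partitions`, as in the quoted proposal, and counts how many partitions each kind of lookup has to open.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedCatalog:
    """Toy catalog whose rows are split across partitions by oid % n_partitions."""

    n_partitions: int
    partitions: list = field(init=False)
    probes: int = field(default=0, init=False)  # partitions opened so far

    def __post_init__(self):
        # One dict per partition, mapping oid -> object name.
        self.partitions = [{} for _ in range(self.n_partitions)]

    def insert(self, oid, name):
        self.partitions[oid % self.n_partitions][oid] = name

    def lookup_by_oid(self, oid):
        # The oid itself identifies the partition: exactly one probe.
        self.probes += 1
        return self.partitions[oid % self.n_partitions].get(oid)

    def lookup_by_name(self, name):
        # A name says nothing about the partition, so every partition
        # may have to be opened until a match is found.
        for part in self.partitions:
            self.probes += 1
            for oid, candidate in part.items():
                if candidate == name:
                    return oid
        return None


catalog = PartitionedCatalog(n_partitions=4)
catalog.insert(16384, "accounts")
catalog.insert(16387, "orders")
```

In this model a lookup by oid opens a single partition, while a lookup by a name that does not exist opens all of them, which is the cost being objected to. A real system would use per-partition indexes rather than a linear scan inside each partition, but the number of partitions visited is the same.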