Thread: Postgres-R: primary key patches
Hi,

as you might know, Postgres-R relies on primary keys to address tuples of a table. It cannot replicate tables without a primary key. Primary keys currently aren't really used within the executor, so I had to extend and modify Postgres here and there to get the required information. To ease reviewing, I have split out these modifications and present them here as two separate little patches.

The first one, get_pkey_index_oid.diff, changes the function relationHasPrimaryKey into GetPrimaryKeyIndexOid, which works pretty much the same as before, except that it returns the index oid instead of just a boolean. (In the current Postgres-R code, I've duplicated that code to src/backend/replication/recovery.c)

And secondly, the add_pkey_info.diff patch adds a boolean field ii_Primary to the IndexInfo struct and an ri_PrimaryKey field to the ResultRelInfo struct, which is an index into the indexInfoArray.

I think these are relatively trivial modifications which could be helpful for other purposes as well. So I suggest applying them to mainline whenever appropriate (read: choose the appropriate commit fest).

This also raises the more general question of how to start collaborating on Postgres-R. I realize that it's a pretty huge project. However, I'm unsure how to ease reviewing for others, so if you have any ideas or questions, please don't hesitate to ask.

Regards

Markus

============================================================
*** src/backend/commands/indexcmds.c	61a8b3774b682554e8670624583ab4cf4b9dbdb9
--- src/backend/commands/indexcmds.c	dc6fc2a3fbce90748bcf4cd7a60ea2ea887bc97f
*************** static Oid GetIndexOpClass(List *opclass
*** 64,70 ****
  				bool isconstraint);
  static Oid GetIndexOpClass(List *opclass, Oid attrType,
  				char *accessMethodName, Oid accessMethodId);
- static bool relationHasPrimaryKey(Relation rel);

  /*
--- 64,69 ----
*************** DefineIndex(RangeVar *heapRelation,
*** 324,330 ****
  	 * it's no problem either.
  	 */
  	if (is_alter_table &&
! 		relationHasPrimaryKey(rel))
  	{
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
--- 323,329 ----
  	 * it's no problem either.
  	 */
  	if (is_alter_table &&
! 		(GetPrimaryKeyIndexOid(rel) != InvalidOid))
  	{
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
*************** ChooseRelationName(const char *name1, co
*** 1216,1229 ****
  }

  /*
!  * relationHasPrimaryKey -
  *
!  *	See whether an existing relation has a primary key.
  */
! static bool
! relationHasPrimaryKey(Relation rel)
  {
! 	bool		result = false;
  	List	   *indexoidlist;
  	ListCell   *indexoidscan;
--- 1215,1229 ----
  }

  /*
!  * GetPrimaryKeyIndexOid
  *
!  *	Returns the oid of the primary key index of the relation, if any,
!  *	otherwise InvalidOid is returned.
  */
! Oid
! GetPrimaryKeyIndexOid(Relation rel)
  {
! 	Oid			result = InvalidOid;
  	List	   *indexoidlist;
  	ListCell   *indexoidscan;
*************** relationHasPrimaryKey(Relation rel)
*** 1244,1257 ****
  									 0, 0, 0);
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
  			elog(ERROR, "cache lookup failed for index %u", indexoid);
! 		result = ((Form_pg_index) GETSTRUCT(indexTuple))->indisprimary;
  		ReleaseSysCache(indexTuple);
! 		if (result)
  			break;
  	}

  	list_free(indexoidlist);
- 
  	return result;
  }
--- 1244,1260 ----
  									 0, 0, 0);
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
  			elog(ERROR, "cache lookup failed for index %u", indexoid);
! 
! 		if (((Form_pg_index) GETSTRUCT(indexTuple))->indisprimary)
! 			result = indexoid;
! 
  		ReleaseSysCache(indexTuple);
! 
! 		if (result != InvalidOid)
  			break;
  	}

  	list_free(indexoidlist);

  	return result;
  }
============================================================
*** src/include/commands/defrem.h	e2384af33d917bff68234bbe407ea16e3ec43123
--- src/include/commands/defrem.h	58bb763402c9bef8ead035a3524505ad8fe58de5
***************
*** 15,22 ****
  #define DEFREM_H

  #include "nodes/parsenodes.h"
- 
  /* commands/indexcmds.c */
  extern void DefineIndex(RangeVar *heapRelation,
  						char *indexRelationName,
--- 15,22 ----
  #define DEFREM_H

  #include "nodes/parsenodes.h"
+ #include "utils/relcache.h"

  /* commands/indexcmds.c */
  extern void DefineIndex(RangeVar *heapRelation,
  						char *indexRelationName,
*************** extern Oid GetDefaultOpClass(Oid type_id
*** 43,48 ****
--- 43,49 ----
  extern char *ChooseRelationName(const char *name1, const char *name2,
  						const char *label, Oid namespace);
  extern Oid	GetDefaultOpClass(Oid type_id, Oid am_id);
+ extern Oid	GetPrimaryKeyIndexOid(Relation rel);

  /* commands/functioncmds.c */
  extern void CreateFunction(CreateFunctionStmt *stmt);
============================================================
*** src/backend/catalog/index.c	c360fcfd1002ffa557c1a376d3e74c9c2a0924db
--- src/backend/catalog/index.c	7201f06c5c1ad213a6acb6b694b666dd38358234
*************** BuildIndexInfo(Relation index)
*** 999,1004 ****
--- 999,1005 ----

  	/* other info */
  	ii->ii_Unique = indexStruct->indisunique;
+ 	ii->ii_Primary = indexStruct->indisprimary;
  	ii->ii_ReadyForInserts = indexStruct->indisready;

  	/* initialize index-build state to default */
============================================================
*** src/backend/executor/execUtils.c	54719433b61db70e6b433cd165c9c7a7b15e6531
--- src/backend/executor/execUtils.c	62756367e3fb34d96e0bb8c1bff4ee25f6402a4a
*************** ExecOpenIndices(ResultRelInfo *resultRel
*** 935,940 ****
--- 935,943 ----
  		/* extract index key information from the index's pg_index info */
  		ii = BuildIndexInfo(indexDesc);

+ 		if (ii->ii_Primary)
+ 			resultRelInfo->ri_PrimaryKey = i;
+ 
  		relationDescs[i] = indexDesc;
  		indexInfoArray[i] = ii;
  		i++;
============================================================
*** src/include/nodes/execnodes.h	9c75d10763d7bedc0a4db62a319d31e1549ad542
--- src/include/nodes/execnodes.h	3bc5ba3ad42bfdb298878f69ea3538fa96ba0815
*************** typedef struct IndexInfo
*** 59,64 ****
--- 59,65 ----
  	List	   *ii_Predicate;	/* list of Expr */
  	List	   *ii_PredicateState;	/* list of ExprState */
  	bool		ii_Unique;
+ 	bool		ii_Primary;
  	bool		ii_ReadyForInserts;
  	bool		ii_Concurrent;
  	bool		ii_BrokenHotChain;
*************** typedef struct ResultRelInfo
*** 289,294 ****
--- 290,296 ----
  	Index		ri_RangeTableIndex;
  	Relation	ri_RelationDesc;
  	int			ri_NumIndices;
+ 	int			ri_PrimaryKey;
  	RelationPtr ri_IndexRelationDescs;
  	IndexInfo **ri_IndexRelationInfo;
  	TriggerDesc *ri_TrigDesc;
markus@bluegap.ch (Markus Wanner) writes: > as you might know, Postgres-R relies on primary keys to address tuples > of a table. It cannot replicate tables without a primary key. Slony-I does the same, with the "variation" that it permits the option of using a "candidate primary key," namely an index that is unique+NOT NULL. If it is possible to support that broader notion, that might make addition of these sorts of logic more widely useful. > Primary keys currently aren't really used within the executor, so I > had to extended and modify Postgres here and there, to get the > required information. To ease reviewing I have split out these > modifications and present them here as two separate little patches. I know Jan Wieck has in mind the idea of adding an interface to enable doing highly efficient IUD (Insert/Update/Delete) via generating a way to do direct heap updates, which would be *enormously* more efficient than the present need (in Slony-I, for instance) to parse, plan and execute thousands of IUD statements. For UPDATE/DELETE to work requires utilizing (candidate) primary keys, so there is some seemingly relevant similarity there. -- select 'cbbrowne' || '@' || 'linuxfinances.info'; http://cbbrowne.com/info/lsf.html Rules of the Evil Overlord #145. "My dungeon cell decor will not feature exposed pipes. While they add to the gloomy atmosphere, they are good conductors of vibrations and a lot of prisoners know Morse code." <http://www.eviloverlord.com/>
Hello Chris,

chris wrote:
> Slony-I does the same, with the "variation" that it permits the option
> of using a "candidate primary key," namely an index that is unique+NOT
> NULL.
>
> If it is possible to support that broader notion, that might make
> addition of these sorts of logic more widely useful.

Well, yeah, that's technically not much different, so it would probably be very easy to extend Postgres-R to work on any arbitrary index.

But what do we have primary keys for, in the first place? Isn't it exactly the *primary* key into the table which you want to use for replication? Or do we need an additional per-table configuration option for that? A REPLICATION KEY besides the PRIMARY KEY?

> I know Jan Wieck has in mind the idea of adding an interface to enable
> doing highly efficient IUD (Insert/Update/Delete) via generating a way
> to do direct heap updates, which would be *enormously* more efficient
> than the present need (in Slony-I, for instance) to parse, plan and
> execute thousands of IUD statements. For UPDATE/DELETE to work
> requires utilizing (candidate) primary keys, so there is some
> seemingly relevant similarity there.

Definitely. The remote backend does exactly that for Postgres-R: it takes a change set, which consists of one or more tuple collections, and then applies these collections. See ExecProcessCollection() in execMain.c.

(Although, I'm still less than thrilled about the internal storage format of these tuple collections. That can certainly be improved and simplified.)

Regards

Markus
On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote: > Hello Chris, > > chris wrote: >> Slony-I does the same, with the "variation" that it permits the >> option of using a "candidate primary key," namely an index that is >> unique+NOT NULL. >> >> If it is possible to support that broader notion, that might make >> addition of these sorts of logic more widely useful. > > Well, yeah, that's technically not much different, so it would > probably be very easy to extend Postgres-R to work on any arbitrary > Index. > > But what do we have primary keys for, in the first place? Isn't it > exactly the *primay* key into the table, which you want to use for > replication? Or do we need an additional per-table configuration > option for that? A REPLICATION KEY besides the PRIMARY KEY? We have them because people are used to thinking in terms of a "PRIMARY KEY," not because that concept is actually distinguishable from a non-partial UNIQUE NOT NULL constraint. While I'm a "chicken" rather than a "pig" on this project <http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig>, I believe that covering the more general case right from the start would be a much better plan. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote: >> But what do we have primary keys for, in the first place? > We have them because people are used to thinking in terms of a > "PRIMARY KEY," not because that concept is actually distinguishable > from a non-partial UNIQUE NOT NULL constraint. No, we have them because the SQL standard actually assigns a distinct meaning to a primary key. (It's the default foreign key reference column(s) for the table --- and in that context it's clear that There Can Be Only One.) regards, tom lane
"David Fetter" <david@fetter.org> writes:
> On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
>> Hello Chris,
>>
>> chris wrote:
>>> Slony-I does the same, with the "variation" that it permits the
>>> option of using a "candidate primary key," namely an index that is
>>> unique+NOT NULL.
>>>
>>> If it is possible to support that broader notion, that might make
>>> addition of these sorts of logic more widely useful.
>>
>> Well, yeah, that's technically not much different, so it would
>> probably be very easy to extend Postgres-R to work on any arbitrary
>> index.
>>
>> But what do we have primary keys for, in the first place? Isn't it
>> exactly the *primary* key into the table, which you want to use for
>> replication? Or do we need an additional per-table configuration
>> option for that? A REPLICATION KEY besides the PRIMARY KEY?

Hm, it occurs to me that really Slony should be saying

    WHERE (col1,col2,...) = ('x','y','z',...)

and letting the server figure out what access method is best for finding the candidate record. That could mean using the primary key index, or it could mean using some other index (perhaps a partial index for example).

It would be nice if there was a way for Slony to express to the server that really, it only needs any UNIQUE NOT NULL combination of columns to match. Once the server has any such combination which matches it can skip checking the rest. I can't think of any way to write such a query in SQL.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Hi, David Fetter wrote: > While I'm a "chicken" rather than a "pig" on this project > <http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig>, I believe that > covering the more general case right from the start would be a much > better plan. I was trying to say that Postgres-R internally relies only on a unique index with not null constraint. It doesn't care if you name it PRIMARY KEY or REPLICATION KEY or whatever. So, it's just a question of the syntax. We already have PRIMARY KEYs, and those are pretty much what I think is needed in 99% of all cases as the pointer to the replication While I'm normally an absolute fan of generality, I think you didn't quite get the point.
Hi, sorry, some strange key-combination made my mail client send too early... I myself wrote: > I was trying to say that Postgres-R internally relies only on a unique > index with not null constraint. It doesn't care if you name it PRIMARY > KEY or REPLICATION KEY or whatever. > > So, it's just a question of the syntax. We already have PRIMARY KEYs, > and those are pretty much what I think is needed in 99% of all cases as > the pointer to the replication .. as the pointer to the index to use for replication. Offering the user a possibility to choose another (index + not null) would require something like ALTER TABLE ... ADD REPLICATION KEY ... or some such. Mostly syntactic sugar, which can be added as soon as we really need it. > While I'm normally an absolute fan of generality, > > I think you didn't quite get the point. Iiik.. that's what I didn't want to send and wanted to delete before sending... :-) Sorry. Regards Markus
Hi, Tom Lane wrote: > It's the default foreign key reference > column(s) for the table That's why I think it makes for a pretty good replication key as well. Regards Markus
Hi,

I realize that you are talking about Slony; let me answer for the Postgres-R case anyway.

Gregory Stark wrote:
> Hm, it occurs to me that really Slony should be saying
>     WHERE (col1,col2,...) = ('x','y','z',...)

Hm.. that would mean increasing the amount of work for the remote backend, which applies remote transactions. For scalability reasons, I'm trying to keep that minimal.

> and letting the server figure out what access method is best for finding the
> candidate record. That could mean using the primary key index, or it could
> mean using some other index (perhaps a partial index for example).

For Postgres-R, I think that would only be a gain in those cases where all tuples of a collection (or even the entire change set) only affect tuples from a partial index. That doesn't look like it's worth the trouble, IMO. Or do you think that's a frequent case? Thinking about it, I'd even say that requiring only one index frequently is favorable because of caching effects. Dunno.

> It would be nice if there was a way for Slony to express to the server that
> really, it only needs any UNIQUE NOT NULL combination of columns to match.
> Once the server has any such combination which matches it can skip checking
> the rest. I can't think of any way to write such a query in SQL.

I don't quite get your point here. For UPDATEs which change the PRIMARY KEY, the sender currently sends the *old* values plus the changes. In that case, you certainly don't want to send the entire old tuple, but only the fields for *one* KEY. That's what I'm calling the replication key. (And currently equals the PRIMARY KEY).

Maybe I'm thinking too much in terms of Postgres-R, instead of Slony, which you are talking about.

Regards

Markus
Markus Wanner wrote:
> Gregory Stark wrote:
>> It would be nice if there was a way for Slony to express to the server
>> that really, it only needs any UNIQUE NOT NULL combination of columns
>> to match. Once the server has any such combination which matches it
>> can skip checking the rest. I can't think of any way to write such a
>> query in SQL.
>
> I don't quite get your point here. For UPDATEs which change the PRIMARY
> KEY, the sender currently sends the *old* values plus the changes. In
> that case, you certainly don't want to send the entire old tuple, but
> only the fields for *one* KEY. That's what I'm calling the replication
> key. (And currently equals the PRIMARY KEY).

I think the point here is that you need to distinguish which tuple you need to update. For this, our Replicator uses the primary key only; there's no way to use another candidate key (unique not null). It would certainly be possible to use a different candidate key, but as far as I know no customer has ever requested this.

(FWIW we don't send the old values -- only the original PK columns, the values of columns that changed, and the "update mask" in terms of heap_modify_tuple.)

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hi, Alvaro Herrera wrote: > I think the point here is that you need to distinguish which tuple you > need to update. For this, our Replicator uses the primary key only; > there's no way to use another candidate key (unique not null). It would > certainly be possible to use a different candidate key, Yeah, and for this to work, the *sender* needs to decide on a key to use. > but as far as I > know no customer has ever requested this. I can't see the use case for a separate REPLICATION KEY, different from the PRIMARY KEY, either.. > (FWIW we don't send the old values -- only the original PK columns, the > values of columns that changed, and the "update mask" in terms of > heap_modify_tuple.) Yup, that's pretty much the same what I'm doing for Postgres-R. Regards Markus
markus@bluegap.ch (Markus Wanner) writes:
> Hello Chris,
>
> chris wrote:
>> Slony-I does the same, with the "variation" that it permits the option
>> of using a "candidate primary key," namely an index that is unique+NOT
>> NULL.
>>
>> If it is possible to support that broader notion, that might make
>> addition of these sorts of logic more widely useful.
>
> Well, yeah, that's technically not much different, so it would
> probably be very easy to extend Postgres-R to work on any arbitrary
> index.
>
> But what do we have primary keys for, in the first place? Isn't it
> exactly the *primary* key into the table, which you want to use for
> replication? Or do we need an additional per-table configuration
> option for that? A REPLICATION KEY besides the PRIMARY KEY?

I agree with you that tables are *supposed* to have primary keys; that's proper design, and if tables are missing them, then something is definitely broken.

Sometimes, unfortunately, people make errors in design, and we wind up needing to accommodate situations that are "less than perfect."

The "happy happenstance" is that, in modern versions of PostgreSQL, a unique index may be added in the background so that this may be rectified without outage if you can live with a "candidate primary key" rather than a true PRIMARY KEY.

It seems to me that this extension can cover over a number of "design sins," which looks like a very kind accommodation, where it is surely preferable to design it in earlier rather than later.

>> I know Jan Wieck has in mind the idea of adding an interface to enable
>> doing highly efficient IUD (Insert/Update/Delete) via generating a way
>> to do direct heap updates, which would be *enormously* more efficient
>> than the present need (in Slony-I, for instance) to parse, plan and
>> execute thousands of IUD statements. For UPDATE/DELETE to work
>> requires utilizing (candidate) primary keys, so there is some
>> seemingly relevant similarity there.
>
> Definitely.
> The remote backend does exactly that for Postgres-R: it
> takes a change set, which consists of one or more tuple collections,
> and then applies these collections. See ExecProcessCollection() in
> execMain.c.
>
> (Although, I'm still less than thrilled about the internal storage
> format of these tuple collections. That can certainly be improved and
> simplified.)

You may want to have a chat with Jan; he's got some thoughts on a more general purpose mechanism that would be good for this as well as for (we think) extremely efficient bulk data loading.

-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Hi,

chris wrote:
> I agree with you that tables are *supposed* to have primary keys;
> that's proper design, and if tables are missing them, then something
> is definitely broken.

Ah, I see, so you are not concerned about tables with a PRIMARY KEY for which one wants another REPLICATION KEY, but only about tables without a PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the first place.

However, that's a general limitation of replication at tuple level: you need to be able to uniquely identify tuples. (Unlike replication on storage level, which can use the storage location for that).

> Sometimes, unfortunately, people make errors in design, and we wind up
> needing to accommodate situations that are "less than perfect."
>
> The "happy happenstance" is that, in modern versions of PostgreSQL, a
> unique index may be added in the background so that this may be
> rectified without outage if you can live with a "candidate primary
> key" rather than a true PRIMARY KEY.

I cannot see any reason for not wanting a PRIMARY KEY, but wanting replication, and therefore a REPLICATION KEY.

Or are you saying we should add a hidden REPLICATION KEY for people who are afraid of schema changes and dislike a visible primary key? Would you want to hide the underlying index as well?

> It seems to me that this extension can cover over a number of "design
> sins," which looks like a very kind accommodation, where it is surely
> preferable to design it in earlier rather than later.

Sorry, but I fail to see any real advantage of that covering of "sins". I would find it rather confusing to have keys and indices hidden from the admin. It's not like an additional index or a primary key would lead to functional changes.

That's certainly different for additional columns, where a SELECT * could all of a sudden return more columns than before. So that's the exception where I agree that hiding such an additional column like we already do for system columns would make sense.
That's for example the situation where you add an 'id' column later on and make that the new primary (and thus replication) key. Maybe that's what you meant? However, even in that case, I wouldn't hide the index nor the primary key, but only the column. Regards Markus
Hi,

chris wrote:
> You may want to have a chat with Jan; he's got some thoughts on a more
> general purpose mechanism that would be good for this as well as for
> (we think) extremely efficient bulk data loading.

Jan, mind sharing your thoughts? What use cases for such a general purpose mechanism do you see?

What I can imagine doing on top of Postgres-R is: splitting up the data and feeding multiple backends with it. Unlike Postgres-R's internal use, you'd still have to check the data against constraints, I think. It would involve the origin backend asking for help from the manager. That one checks for available helper backends and then serves as a message dispatcher between the origin and helper backends (as it does for replication purposes). Please note that it already uses shared memory extensively, so the manager doesn't need to copy around the data itself.

Regards

Markus
Markus Wanner wrote: > (Although, I'm still less than thrilled about the internal storage > format of these tuple collections. That can certainly be improved and > simplified.) Care to expand more on what it is? On Replicator we're using the binary send/recv routines to transmit tuples. (Obviously this fails when the master and slave have differing binary output, but currently we just punt on this point). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Hi,

Alvaro Herrera wrote:
> Markus Wanner wrote:
>> (Although, I'm still less than thrilled about the internal storage
>> format of these tuple collections. That can certainly be improved and
>> simplified.)
>
> Care to expand more on what it is?

Well, what I really dislike is the overhead in code to first transform tuples into a string based internal change set representation, which then gets serialized again. That looks like two conversion steps, which are both prone to error. I'm about to merge those into a simpler tuple serializer, which shares code with the initializer (or recovery provider/subscriber) part. This is where I'd like to know what requirements Jan or others have. I will try to outline the current implementation and requirements of Postgres-R in a new thread soon.

> On Replicator we're using the binary send/recv routines to transmit
> tuples. (Obviously this fails when the master and slave have differing
> binary output, but currently we just punt on this point).

Yeah, that's another point. I'm currently using the textual input/output functions, but would like to switch to the binary ones as well. However, that's an optimization, whereas the above is a simplification of the code, which is more important to me at the moment.

Regards

Markus
markus@bluegap.ch (Markus Wanner) writes:
> chris wrote:
>> I agree with you that tables are *supposed* to have primary keys;
>> that's proper design, and if tables are missing them, then something
>> is definitely broken.
>
> Ah, I see, so you are not concerned about tables with a PRIMARY KEY
> for which one wants another REPLICATION KEY, but only about tables
> without a PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the
> first place.

"Doesn't want" is probably overstating the matter.

I'll describe a scenario to suggest where it might happen.

- A system is implemented, using the database, and, for some reason,
  no PRIMARY KEY is defined for a table. Someone forgot; it got
  misconfigured; a mistake was probably made.

- The system then goes into production, and runs for a while. The
  table has data added to it, and starts to grow rather large.

- At THIS point, we decide to introduce replication, only to discover
  that there isn't a PRIMARY KEY on the table.

Ideally, we'd take an outage and add the primary key. But suppose we can't afford to do so?

The "add indexes concurrently" added in 8.3 (if memory serves) *would* allow us to create a *candidate* primary key without forcing an outage.

In theory, we'd like to have a true primary key. Sometimes operational issues get in the way.

> However, that's a general limitation of replication at tuple level:
> you need to be able to uniquely identify tuples. (Unlike replication
> on storage level, which can use the storage location for that).

No disagreement; yes, we certainly do need a way to uniquely identify tuples, otherwise we can't replicate UPDATE or DELETE.

>> Sometimes, unfortunately, people make errors in design, and we wind up
>> needing to accommodate situations that are "less than perfect."
>>
>> The "happy happenstance" is that, in modern versions of PostgreSQL, a
>> unique index may be added in the background so that this may be
>> rectified without outage if you can live with a "candidate primary
>> key" rather than a true PRIMARY KEY.
>
> I cannot see any reason for not wanting a PRIMARY KEY, but wanting
> replication, and therefore a REPLICATION KEY.
>
> Or are you saying we should add a hidden REPLICATION KEY for people
> who are afraid of schema changes and dislike a visible primary key?
> Would you want to hide the underlying index as well?

The scenario I outline above hopefully answers this. It's not a matter that I expect people to specifically desire not to have a primary key. Instead, I expect cases where mistakes compound with operational issues to make them say "Ow - I can't do that now!"

-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Hi,

chris wrote:
> I'll describe a scenario to suggest where it might happen.
>
> - A system is implemented, using the database, and, for some reason,
>   no PRIMARY KEY is defined for a table. Someone forgot; it got
>   misconfigured; a mistake was probably made.
>
> - The system then goes into production, and runs for a while. The
>   table has data added to it, and starts to grow rather large.
>
> - At THIS point, we decide to introduce replication, only to discover
>   that there isn't a PRIMARY KEY on the table.

Yeah, that's the situation I had in mind as well.

> Ideally, we'd take an outage and add the primary key. But suppose we
> can't afford to do so?

You are assuming that one doesn't need to take an outage to start replication in the first place. As Postgres-R comes with system catalog changes, that's not the case. You will at least need to restart the postmaster; without some sort of system catalog upgrading (which doesn't currently exist) you even need a full dump/restore cycle.

> The "add indexes concurrently" added in 8.3 (if memory serves) *would*
> allow us to create a *candidate* primary key without forcing an
> outage.

Postgres-R is primarily being developed for *future* versions of Postgres; I don't see any point in back porting something that is not complete for the current version, yet.

>> However, that's a general limitation of replication at tuple level:
>> you need to be able to uniquely identify tuples. (Unlike replication
>> on storage level, which can use the storage location for that).
>
> No disagreement; yes, we certainly do need a way to uniquely identify
> tuples, otherwise we can't replicate UPDATE or DELETE.

Yup, that's the real issue here.

> The scenario I outline above hopefully answers this.

I see the problem of wanting to replicate tables which didn't have a PRIMARY KEY before. But I still cannot see a use case for hiding indices or keys.

> It's not a matter that I expect people to specifically desire not to
> have a primary key. Instead, I expect cases where mistakes compound
> with operational issues to make them say "Ow - I can't do that now!"

Yeah, these issues certainly need to be addressed. I think the ability to add a hidden column and a (visible!) primary key on that column should help in that case.

Thinking about index creation time doesn't make sense, as long as we still need a dump/restore cycle to setup replication. And even then, that operational issue has nothing to do with the question of hiding the newly generated index or not.

Regards

Markus
Markus Wanner <markus@bluegap.ch> writes:
> Thinking about index creation time doesn't make sense, as long as we
> still need a dump/restore cycle to setup replication. And even then,
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> that operational issue has nothing to do with the question of hiding
> the newly generated index or not.

Let me note that one of the design criteria for Slony-I was to explicitly NOT have such a requirement.

Making the assumption that it *is* acceptable to disrupt operations for the duration of a dump/restore cycle is certain to limit interest in a replication system.

A most pointed case where that will cause heartburn of the "I refuse to use this" sort is if that disruption needs to take place when recovering from the failure of a node. That sort of disruption is certainly counterproductive to the usual goal of replication enhancing system availability.

Maybe I am misreading you; I rather hope so.

-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Hi,

Christopher Browne wrote:
> Markus Wanner <markus@bluegap.ch> writes:
>> Thinking about index creation time doesn't make sense, as long as we
>> still need a dump/restore cycle to setup replication. And even then,
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> that operational issue has nothing to do with the question of hiding
>> the newly generated index or not.
>
> Let me note that one of the design criteria for Slony-I was to
> explicitly NOT have such a requirement.

That's a pretty cool feature, but it hasn't been one of the primary
design goals of Postgres-R.

> Making the assumption that it *is* acceptable to disrupt operations
> for the duration of a dump/restore cycle is certain to limit interest
> in a replication system.

I agree, that's certainly true.

> A most pointed case where that will cause heartburn of the "I refuse
> to use this" sort is if that disruption needs to take place when
> recovering from the failure of a node. That sort of disruption is
> certainly counterproductive to the usual goal of replication enhancing
> system availability.

Huh? What does migration between major Postgres versions have to do
with node failures or recoveries? System availability certainly *is*
one of the primary design goals of Postgres-R. Thus, once installed and
running, you certainly don't need any such procedure again. Certainly
not due to node failures.

It would be nice if future upgrades (i.e. major version upgrades) of
single nodes could be done while the rest of the cluster is running.
That would mean having a pretty static binary change set communication
protocol which works between different major Postgres versions. That's
certainly planned, but hey, we don't have a production-ready version
for *any* major version, yet.

> Maybe I am misreading you; I rather hope so.

With an additional process and schema changes, Postgres-R takes quite a
different approach than Slony. I don't think that would have been
possible without forcing at least a Postmaster restart.

The schema changes are pretty minimal and can probably be done manually
(well, script driven, perhaps) before restarting with a Postmaster
which has replication compiled in. That would save the dump/restore
cycle, but certainly not the Postmaster restart.

However, with regard to the catalog version, Postgres-R can be thought
of as another major version of Postgres. (Maybe I should even extend
the catalog version with an 'R' or something, so as to prevent a normal
Postgres version from running on a data directory of a Postgres-R
installation.)

Regards

Markus
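[Editor's note: the parenthetical idea above — tagging the catalog
version with an 'R' so that a plain Postgres binary refuses to open a
Postgres-R data directory — amounts to an exact-match startup check. A
toy sketch follows; the version strings are made up for illustration
and are not real catalog versions.]

```python
# Toy sketch of the catalog-version tagging idea (illustrative only;
# the version strings below are invented). A Postgres binary refuses to
# start on a data directory whose catalog version string does not
# exactly match its own, so a trailing 'R' keeps the plain and
# Postgres-R variants apart even within the same major version.

PLAIN_CATVERSION = "200807231"            # hypothetical plain catalog version
R_CATVERSION = PLAIN_CATVERSION + "R"     # same catalogs, Postgres-R variant

def binary_accepts_datadir(binary_version: str, datadir_version: str) -> bool:
    """Mimic the startup compatibility check: only an exact match passes."""
    return binary_version == datadir_version

# A plain binary rejects a Postgres-R data directory, and vice versa.
print(binary_accepts_datadir(PLAIN_CATVERSION, R_CATVERSION))  # False
print(binary_accepts_datadir(R_CATVERSION, R_CATVERSION))      # True
```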
On Tuesday 22 July 2008, Christopher Browne wrote:
> A most pointed case where that will cause heartburn of the "I refuse
> to use this" sort is if that disruption needs to take place when
> recovering from the failure of a node. That sort of disruption is
> certainly counterproductive to the usual goal of replication enhancing
> system availability.
>
> Maybe I am misreading you; I rather hope so.

This part of Markus's mail makes me think the need may change if
Postgres-R is ever integrated into -core:

On Tuesday 22 July 2008, Markus Wanner wrote:
> As Postgres-R comes with system catalog changes, that's not the case.

So currently, to use Postgres-R, you'd have to start with a patched
code base at each and every node, because that's how Markus wanted to
proceed (Postgres-R being a separate code base). In Postgres-R, adding
a node to the cluster is what is done without a dump/restore cycle.

Now that he's open-sourcing the solution, I hope to see this mode of
operation change, thanks to integration of some key parts (the catalog
changes) of the project into -core, if possible.

Note that while Slony doesn't require a dump/restore to get activated,
it seems to me (as a non-user of it) that it still plays with the
catalogs, preventing "normal" usage of pg_dump...

I'm not sure which disease I prefer: not being able to dump/restore
normally, or having to restore on a specific product version rather
than the -core one.

Just my 2 cents, hoping I'm understanding the point at hand correctly,
-- 
dim
Markus Wanner wrote:
>> Ideally, we'd take an outage and add the primary key. But suppose we
>> can't afford to do so?
>
> You are assuming that one doesn't need to take an outage to start
> replication in the first place. As Postgres-R comes with system
> catalog changes, that's not the case. You will at least need to
> restart the postmaster; without some sort of system catalog upgrading
> (which doesn't currently exist) you even need a full dump/restore
> cycle.

Hey, for Replicator I wrote a bootstrapping system for "catalog
upgrading" -- it starts a special "bootstrap mode" and allows creating
new catalogs, their indexes, and a bunch of functions. Afterwards
everything is considered "internal". It's quite hackish but it works ...

-- 
Alvaro Herrera                       http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Hi,

Dimitri Fontaine wrote:
> This part of Markus's mail makes me think the need may change if
> Postgres-R is ever integrated into -core:

Yes, in that case, you'd have replication already compiled in and
distributed with standard Postgres. However, ATM that's pipe dreaming,
and I'm pretty sure no developer (neither me nor the Postgres hackers)
wants to mix code (and responsibility!) at this stage of development of
Postgres-R. The most I'd be willing to ask for at the moment would be
to get a range of OIDs reserved for use in Postgres-R. It would not
make sense at the moment to add the schema changes to standard
Postgres, because I will pretty much have to change them again.

> So currently to use Postgres-R you'd have to start with a patched code
> base at each and every node, because it's how Markus wanted to proceed
> (Postgres-R being a separated code base). In Postgres-R adding a node
> to the cluster is what is done without dump/restore cycle.

Yup.

> Now that he's Open-Sourcing the solution, I hope to see this mode of
> operation change, thanks to integration of some key part (catalog
> changes) of the project into -core, if possible.

Sorry, but at the moment I disagree, because I think this would
complicate matters for both projects. This might (and hopefully will)
change, sure. But we are not there, yet.

> Note that while slony doesn't require a dump/restore to get activated,
> it seems to me (as a non user of it) that it still plays with catalog,
> preventing "normal" usage of pg_dump...

Oh, does it? Well, it obviously doesn't require a Postmaster restart,
nor does it add a separate background process.

> I'm not sure which disease I prefer: not being able to dump/restore
> normally or getting to have to restore on a specific product version,
> not the -core one.

I think this process of moving between ordinary Postgres and Postgres-R
schema variants for the same(!) major version can be automated. It
would be a pretty small pg_upgrade sort of tool. I'm not that afraid of
these schema changes. Heck, in the worst case, we could even let
Postgres-R add them itself during startup.

Sorry if this sounds a little rude. I've just had the 'why isn't
Postgres-R integrated?' discussion a little too often.

Regards

Markus Wanner
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Note that while slony doesn't require a dump/restore to get activated,
> it seems to me (as a non user of it) that it still plays with catalog,
> preventing "normal" usage of pg_dump...

FYI, that will no longer be the case in version 2.0 of Slony-I; with
the changes made in 8.3, it is no longer necessary to play with the
catalog in the unclean ways that have traditionally made pg_dump
"break."
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not
feature exposed pipes. While they add to the gloomy atmosphere, they
are good conductors of vibrations and a lot of prisoners know Morse
code." <http://www.eviloverlord.com/>
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Note that while slony doesn't require a dump/restore to get activated,
> it seems to me (as a non user of it) that it still plays with catalog,
> preventing "normal" usage of pg_dump...

As of 8.3 there are some new trigger features in core that were put
there for Slony. I'm not sure to what extent that will let them get rid
of making nonstandard catalog changes ... perhaps Chris or Jan can
explain.

			regards, tom lane