Hi, Andres!
There is my review of this patch.
1) Patches don't apply cleanly to head. So I used commit bed88fceac04042f0105eb22a018a4f91d64400d as the base for patches, then all the patches close to this apply cleanly. Regression tests pass OK, but it seems that new functionality isn't covered by regression tests.
2) Patch needs more comments. I think we need at least one comment in head of each function describing its behaviour, even if it is evident from function name.
4) There is significant code duplication in APPLY_CACHE_CHANGE_UPDATE and APPLY_CACHE_CHANGE_DELETE branches of case in apply_change function. I think this could be refactored to reduce code duplication.
5) Apply mechanism requires PK from each table. So, throwing error here if we don't find PK is necessary. But we need to prevent user from run logical replication earlier than failing applying received messages. AFACS patch which is creating corresponding log messages is here:
http://archives.postgresql.org/message-id/1339586927-13156-7-git-send-email-andres@2ndquadrant.com. And it throws any warning if it fails to find PK. On which stage we prevent user from running logical replication on tables which doesn't have PK?
6) I've seen comment /* FIXME: locking */. But you open index with command
index_rel = index_open(indexoid, AccessShareLock);
and close it with command
heap_close(index_rel, NoLock);
Shouldn't we use same level of locking on close as on open? Also, heap_close doesn't looks good to me for closing index. Why don't use index_close or relation_close?
7) We find each updated and deleted tuple by PK. Imagine we update significant part of the table (for example, 10%) in single query and planner choose sequential scan for it. Then applying of this changes could be more expensive than doing original changes. This it probably ok. But, we could do some heuristics: detect that sequential scan is cheaper because of large amount of updates or deletes in one table.
------
With best regards,
Alexander Korotkov.