fixing LISTEN/NOTIFY - Mailing list pgsql-hackers
From | Neil Conway |
---|---|
Subject | fixing LISTEN/NOTIFY |
Date | |
Msg-id | 1128573686.9140.44.camel@localhost.localdomain |
List | pgsql-hackers |
Applications that frequently use LISTEN/NOTIFY can suffer from performance problems because of the MVCC bloat created by frequent insertions into pg_listener. A solution to this has been suggested in the past: rewrite LISTEN/NOTIFY to use shared memory rather than system catalogs.

The problem is that there is a static amount of shared memory and a potentially unbounded number of notifications, so we can run out of memory. There are two ways to solve this. The first is to do as sinval does: clear the shared memory queue, then effectively issue a NOTIFY ALL that awakens all listeners. I don't like this behaviour: it seems ugly to expose an implementation detail (static sizing of shared memory) to applications. While a lot of applications only use LISTEN/NOTIFY for cache invalidation (and so spurious notifications are just a performance hit), this behaviour still seems unfortunate to me. Using NOTIFY ALL also makes NOTIFY 'msg', a feature several users have asked for in the past, far less useful.

The second, which I think would be better, is to either fail the NOTIFY when there is not enough shared memory to add a new notification to the queue, or have the NOTIFY block until shared memory does become available (applications could of course implement the latter on top of the former by using savepoints and a loop, either on the client side or in PL/pgSQL). I guess we could add an option to NOTIFY to specify how to handle failures.

A related question is when to add the notification to the shared memory queue. We don't want the notification to fire until the NOTIFY's transaction commits, so one alternative would be to delay appending to the queue until transaction-commit time. However, that would mean we wouldn't notice NOTIFY failure until the end of the transaction, or else that we would block waiting for free space during the transaction-commit process.

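To make the "fail, and let the application retry" option concrete, here is a minimal Python simulation. It is not PostgreSQL code: `BoundedNotifyQueue`, `QueueFull`, `notify_with_retry` and the `on_wait` hook are hypothetical names made up for illustration. A fixed-capacity queue stands in for the shared memory area, and the retry loop plays the role of the savepoint-and-loop wrapper an application might write around a NOTIFY that can fail.

```python
import time
from collections import deque


class QueueFull(Exception):
    """Raised when the fixed-size queue has no room for another entry."""


class BoundedNotifyQueue:
    """Toy stand-in for a statically sized shared memory queue."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()

    def notify(self, channel, msg=""):
        # "Fail the NOTIFY" policy: refuse the entry instead of dropping
        # older ones or waking every listener.
        if len(self.entries) >= self.capacity:
            raise QueueFull(channel)
        self.entries.append((channel, msg))

    def consume(self):
        # A listener draining one entry, which frees one slot.
        return self.entries.popleft() if self.entries else None


def notify_with_retry(queue, channel, msg="", attempts=5, delay=0.0, on_wait=None):
    """Blocking behaviour layered on top of the failing primitive.

    In a real client each attempt would be wrapped in a savepoint so that
    a failed NOTIFY does not abort the enclosing transaction.
    """
    for _ in range(attempts):
        try:
            queue.notify(channel, msg)
            return True
        except QueueFull:
            if on_wait is not None:
                on_wait()  # hypothetical hook: lets the simulation free space
            time.sleep(delay)
    return False
```

With a capacity of one, a second `notify` raises `QueueFull`, while `notify_with_retry` succeeds as soon as a listener has consumed the first entry. If nothing ever drains the queue, the loop gives up after `attempts` tries, which is the case an option on NOTIFY would have to decide how to report.
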
I think it would be better to add an entry to shared memory during the NOTIFY itself, and stamp that entry with the NOTIFY's toplevel XID. Other backends can read the notification immediately (and once all the backends have seen it, the notification can be removed from the queue). Each backend can use the XID to determine when to "fire" the notification (and if the notifying backend rolls back, they can just discard the notification). This scheme is more expensive when the notifying transaction rolls back, but I don't think that is the common case.

Comments? (I'm still thinking about how to organize the shared memory queue, and whether any of the sinval stuff can be reused...)

-Neil
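The XID-stamped scheme described above can be sketched as a small Python simulation. This is only an illustration under stated assumptions, not a design for the actual shared memory layout: `SharedNotifyQueue`, `Backend`, `XactStatus` and the `xact_status` dictionary (standing in for a transaction-status lookup) are all hypothetical names.

```python
from enum import Enum


class XactStatus(Enum):
    IN_PROGRESS = 1
    COMMITTED = 2
    ABORTED = 3


class SharedNotifyQueue:
    """Toy queue: entries are appended at NOTIFY time, stamped with an XID."""

    def __init__(self, capacity, backend_ids):
        self.capacity = capacity
        self.backend_ids = set(backend_ids)
        self.entries = []

    def append(self, xid, channel):
        if len(self.entries) >= self.capacity:
            raise RuntimeError("notification queue is full")
        # Track which backends have not read this entry yet.
        self.entries.append(
            {"xid": xid, "channel": channel, "unseen": set(self.backend_ids)}
        )

    def read_new(self, backend_id):
        fresh = []
        for entry in self.entries:
            if backend_id in entry["unseen"]:
                entry["unseen"].discard(backend_id)
                fresh.append((entry["xid"], entry["channel"]))
        # Once every backend has seen an entry it can leave the queue.
        self.entries = [e for e in self.entries if e["unseen"]]
        return fresh


class Backend:
    def __init__(self, backend_id, listening):
        self.backend_id = backend_id
        self.listening = set(listening)
        self.pending = []  # entries read but not yet resolved
        self.fired = []    # channels actually delivered

    def poll(self, queue, xact_status):
        self.pending.extend(queue.read_new(self.backend_id))
        still_pending = []
        for xid, channel in self.pending:
            status = xact_status[xid]
            if status is XactStatus.IN_PROGRESS:
                still_pending.append((xid, channel))  # decide later
            elif status is XactStatus.COMMITTED and channel in self.listening:
                self.fired.append(channel)  # "fire" only after commit
            # Aborted (or not listened for): silently discard.
        self.pending = still_pending
```

In this sketch the queue slot is freed as soon as every backend has read the entry, even while the notifying transaction is still open; each backend then holds the entry locally until the XID resolves. An aborted transaction still costs every backend a read and a status check, which is the extra expense on rollback mentioned above.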