Re: Proposal: Snapshot cloning - Mailing list pgsql-hackers
From: Hannu Krosing
Subject: Re: Proposal: Snapshot cloning
Msg-id: 1169795956.3368.8.camel@localhost.localdomain
In response to: Proposal: Snapshot cloning (Jan Wieck <JanWieck@Yahoo.com>)
List: pgsql-hackers
On Thu, 2007-01-25 at 22:19, Jan Wieck wrote:
> Granted this one has a few open ends so far and I'd like to receive
> some constructive input on how to actually implement it.
>
> The idea is to clone an existing serializable transaction's snapshot
> visibility information from one backend to another. The semantics
> would be like this:
>
> backend1: start transaction;
> backend1: set transaction isolation level serializable;
> backend1: select pg_backend_pid();
> backend1: select publish_snapshot(); -- will block
>
> backend2: start transaction;
> backend2: set transaction isolation level serializable;
> backend2: select clone_snapshot(<pid>); -- will unblock backend1
>
> backend1: select publish_snapshot();
>
> backend3: start transaction;
> backend3: set transaction isolation level serializable;
> backend3: select clone_snapshot(<pid>);
>
> ...
>
> This will allow a number of separate backends to assume the same MVCC
> visibility, so that they can query independently but the overall
> result will be according to one consistent snapshot of the database.

I see uses for this in implementing query parallelism in user-level
code, like querying two child tables in two separate processes.

> What I try to accomplish with this is to widen a bottleneck many
> current Slony users are facing. The initial copy of a database is
> currently limited to one single reader to copy a snapshot of the data
> provider. With the above functionality, several tables could be
> copied in parallel by different client threads, feeding separate
> backends on the receiving side at the same time.

I'm afraid that for most configurations this would make the copy
slower, as there will be more random disk I/O. Maybe it would be
better to fix Slony so that it allows initial copies in several
parallel transactions, or just to do the initial copy in several sets
and merge the sets later. (A sketch of one such parallel copy worker
is appended after the message.)

> The feature could also be used by a parallel version of pg_dump as
> well as data mining tools.
>
> The cloning process needs to make sure that the clone_snapshot() call
> is made from the same DB user in the same database as the
> corresponding publish_snapshot() call was done.

Why? A snapshot is universal and the same for the whole db instance,
so why limit it to the same user/database?

> Since publish_snapshot() only publishes the information it gained
> legally, and that is visible in the PGPROC shared memory (xmin, xmax
> being the crucial part here), there is no risk of creating a snapshot
> for which data might have been removed by vacuum already.
>
> What I am not sure about yet is what IPC method would best suit the
> transfer of the arbitrarily sized xip vector. Ideas?

(A possible shared-memory layout for that handoff is also sketched
after the message.)

> Jan

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
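
To make the parallel-copy discussion concrete, here is a minimal
client-side sketch of one copy worker, assuming the proposed
clone_snapshot() function existed on the server. copy_one_table() and
publisher_pid are names invented for illustration; the rest is stock
libpq, with error handling trimmed to keep the sketch short:

/*
 * Hypothetical worker for a parallel initial copy.  Assumes the
 * proposed clone_snapshot(pid) function exists server-side.
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

static void
copy_one_table(const char *conninfo, int publisher_pid, const char *table)
{
    char      sql[256];
    PGresult *res;
    PGconn   *conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
        exit(1);
    }

    PQclear(PQexec(conn, "START TRANSACTION ISOLATION LEVEL SERIALIZABLE"));

    /* Adopt the publisher's snapshot; unblocks its publish_snapshot(). */
    snprintf(sql, sizeof(sql), "SELECT clone_snapshot(%d)", publisher_pid);
    PQclear(PQexec(conn, sql));

    snprintf(sql, sizeof(sql), "COPY %s TO STDOUT", table);
    res = PQexec(conn, sql);
    if (PQresultStatus(res) == PGRES_COPY_OUT)
    {
        char *buf;
        int   len;

        while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
        {
            fwrite(buf, 1, len, stdout);   /* ship the row to the receiver */
            PQfreemem(buf);
        }
        PQclear(PQgetResult(conn));        /* collect the final COPY status */
    }
    PQclear(res);

    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
}

Each worker would run in its own thread or process with its own
connection; because every worker clones the same published snapshot,
the per-table COPY streams together form one consistent image of the
data provider. Whether that actually beats a single sequential reader
is exactly the random-I/O question raised above.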
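
On the xip transfer question: the xip list of a top-level snapshot
appears to hold at most one in-progress xid per backend slot, so it is
bounded by the number of backends rather than arbitrarily sized, and a
fixed-size shared memory slot per publishing backend might be enough.
A sketch of such a layout, with all names invented for illustration
(this is not actual PostgreSQL code):

/*
 * Hypothetical layout for handing a snapshot over in shared memory.
 */
#include <stdint.h>

typedef uint32_t TransactionId;      /* stand-in for the real typedef */

#define MAX_BACKENDS 100             /* stand-in for MaxBackends */

typedef struct SnapshotSlot
{
    int           publisher_pid;     /* 0 means the slot is free */
    TransactionId xmin;
    TransactionId xmax;
    int           xcnt;              /* number of valid entries in xip[] */
    TransactionId xip[MAX_BACKENDS]; /* bounded: one xid per backend slot */
} SnapshotSlot;

/*
 * publish_snapshot() would fill its backend's SnapshotSlot under a lock
 * and then block; clone_snapshot(pid) would find the slot with
 * publisher_pid == pid, memcpy() its contents into backend-local memory,
 * and wake the publisher so publish_snapshot() can return.
 */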