Thread: intercepting WAL writes
<div class="Section1"><p class="MsoNormal"><span style="font-size:10.0pt">Hello,</span><p class="MsoNormal"><span style="font-size:10.0pt"> </span><pclass="MsoNormal"><span style="font-size:10.0pt">I’m new to the core PostgreSQL code,so pardon the question if the answer is really obvious, and I’m just missing it, but I’ve got a relatively large webapplication that uses PostgreSQL as a back-end database, and we’re heavily using memcached to cache frequently accesseddata.</span><p class="MsoNormal"><span style="font-size:10.0pt"> </span><p class="MsoNormal"><span style="font-size:10.0pt">I’mlooking at modifying PostgreSQL (in some way) to push changes directly to our memcache servers,in hopes of moving towards a system where only writes are actually sent to the databases, and reads are exclusivelysent to the memcache servers.</span><p class="MsoNormal"><span style="font-size:10.0pt"> </span><p class="MsoNormal"><spanstyle="font-size:10.0pt">I’m guessing that I could intercept the WAL writes, and use this informationto push out to my memcache servers, similar to a replication model.</span><p class="MsoNormal"><span style="font-size:10.0pt"> </span><pclass="MsoNormal"><span style="font-size:10.0pt">Can somebody point to the most logicalplace in the code to intercept the WAL writes? (just a rough direction would be enough)- or if this doesn’t make senseat all, another suggestion on where to get the data? (I’m trying to avoid doing it using triggers).</span><p class="MsoNormal"><spanstyle="font-size:10.0pt"> </span><p class="MsoNormal"><span style="font-size:10.0pt">Thanks,</span><pclass="MsoNormal"><span style="font-size:10.0pt"> </span><p class="MsoNormal"><spanstyle="font-size:10.0pt">Mike</span></div>
On Wed, May 28, 2008 at 7:11 PM, Mike <mike@fonolo.com> wrote: > Can somebody point to the most logical place in the code to intercept the > WAL writes? (just a rough direction would be enough) XLogInsert > or if this doesn't make sense at all, another suggestion on where to get > the data? (I'm trying to avoid doing it using triggers). Without triggers, you don't have many options. With triggers, you could use pg_memcache. If you take it from the WAL, you'll have to do a bit of decoding to make it usable in the context you're looking for, which is quite a bit of work. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
On Wed, 2008-05-28 at 19:11 -0400, Mike wrote: > Can somebody point to the most logical place in the code to intercept > the WAL writes? (just a rough direction would be enough)- or if this > doesn’t make sense at all, another suggestion on where to get the > data? (I’m trying to avoid doing it using triggers). Why are you avoiding triggers? One solution might be to use Slony to just create the log, and then read the log of events into memcached rather than another PostgreSQL instance. http://slony.info/documentation/logshipping.html Those logs might be easier to process than the WAL. Also, why do you need to intercept the WAL writes, and not just read from a WAL archive? Does this need to be synchronous? Regards,Jeff Davis
On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote: > When you say a bit of decoding, is that because the data written to the logs > is after the query parser/planner? Or because it's written in several > chunks? Or? Because that's the actual recovery record. There is no SQL text, just the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE, XLOG_XACT_COMMIT, ...) and the data as it relates to that operation. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote: >> When you say a bit of decoding, is that because the data written to the logs >> is after the query parser/planner? Or because it's written in several >> chunks? Or? > >Because that's the actual recovery record. There is no SQL text, just >the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE, >XLOG_XACT_COMMIT, ...) and the data as it relates to that operation. Oh- right- that makes sense. I installed and started looking at the source code for xlogviewer and xlogdump; seems like a reasonable place to start. Thanks for your help, Mike
And you will have a chance to encounter full page writes, whole page image, which could be produced during the hot backup and the first modification to the data page after a checkpoint (if you turn full page write option "on" by GUC). 2008/5/29 Mike <mike@fonolo.com>: > On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote: >>> When you say a bit of decoding, is that because the data written to the > logs >>> is after the query parser/planner? Or because it's written in several >>> chunks? Or? >> >>Because that's the actual recovery record. There is no SQL text, just >>the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE, >>XLOG_XACT_COMMIT, ...) and the data as it relates to that operation. > > Oh- right- that makes sense. > > I installed and started looking at the source code for xlogviewer and > xlogdump; seems like a reasonable place to start. > > Thanks for your help, > > Mike > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- ------ Koichi Suzuki
On Wed, May 28, 2008 at 7:11 PM, Mike <mike@fonolo.com> wrote: >> Can somebody point to the most logical place in the code to intercept the >> WAL writes? (just a rough direction would be enough) > >XLogInsert > Great- I'll take a look at that code. >> or if this doesn't make sense at all, another suggestion on where to get >> the data? (I'm trying to avoid doing it using triggers). > >Without triggers, you don't have many options. With triggers, you >could use pg_memcache. If you take it from the WAL, you'll have to do >a bit of decoding to make it usable in the context you're looking for, >which is quite a bit of work. > When you say a bit of decoding, is that because the data written to the logs is after the query parser/planner? Or because it's written in several chunks? Or? I turned on WAL_DEBUG; it wasn't immediately helpful, tho I don't know what it means quite yet. I guess perfectly formatted SQL statements would be too easy ;) Mike
On Wed, 2008-05-28 at 19:11 -0400, Mike wrote: > > Can somebody point to the most logical place in the code to intercept > the WAL writes? (just a rough direction would be enough)- or if this > doesn’t make sense at all, another suggestion on where to get the > data? I don't think that intercepting (and then decoding ) WAL is very productive. It is too low level to be of much help. The way I'd do it would be using pgQ from SkyTools package where change events can be queued when happening and then moved in bulk to memcached with not too much effort. Marko Kreen, the primary author of pgQ did a presentation on it at this years pgcon : http://www.pgcon.org/2008/schedule/events/79.en.html You can download the presentation slides to get the overview. > (I’m trying to avoid doing it using triggers). I can't see a good point in avoiding triggers, if you end up doing a lot more work to reconstruct the data which would have been easy to get using a trigger. I've heard that there was and add-on replication solution for Oracle which extracted data from WAL and I was told that it maxed out at about 200 tps. pgQ approach can do much better. --------------- Hannu
>On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote: >> When you say a bit of decoding, is that because the data written to the logs >> is after the query parser/planner? Or because it's written in several >> chunks? Or? > >Because that's the actual recovery record. There is no SQL text, just >the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE, >XLOG_XACT_COMMIT, ...) and the data as it relates to that operation. Hmm... after looking at the xlogdump and xlogview applications, I can see the difficulty pulling this information you were talking about. I see that I can get the OID's for the relations, and then lookup their names from the source database; but pulling where clause and field info seems to be a bit tougher. Is there another place in the code, I can get access to the statements (or statement "like" information), after a transaction commit? Thanks for all your help, Mike
"Mike" <mike@fonolo.com> writes: > Is there another place in the code, I can get access to the statements (or > statement "like" information), after a transaction commit? No. Bear in mind that what you have decided to do amounts to rolling your own replication system. This is a Hard Problem. I would suggest thinking in terms of adapting one of the existing replication projects rather than doing it from scratch. regards, tom lane
On Wed, 2008-05-28 at 21:47 -0400, Mike wrote: > On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote: > >> When you say a bit of decoding, is that because the data written to the > logs > >> is after the query parser/planner? Or because it's written in several > >> chunks? Or? > > > >Because that's the actual recovery record. There is no SQL text, just > >the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE, > >XLOG_XACT_COMMIT, ...) and the data as it relates to that operation. > > Oh- right- that makes sense. Hmm, yes and no. WAL doesn't carry enough information to reconstruct updates or deletes external to the database in question. That's the barrier you need to cross, not just piping the results somewhere. I envision being able to do this in the future and exposing an API to allow it to happen, but we aren't there yet. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Mike schrieb: > > Hello, > > I’m new to the core PostgreSQL code, so pardon the question if the > answer is really obvious, and I’m just missing it, but I’ve got a > relatively large web application that uses PostgreSQL as a back-end > database, and we’re heavily using memcached to cache frequently > accessed data. > > I’m looking at modifying PostgreSQL (in some way) to push changes > directly to our memcache servers, in hopes of moving towards a system > where only writes are actually sent to the databases, and reads are > exclusively sent to the memcache servers. > > I’m guessing that I could intercept the WAL writes, and use this > information to push out to my memcache servers, similar to a > replication model. > > Can somebody point to the most logical place in the code to intercept > the WAL writes? (just a rough direction would be enough)- or if this > doesn’t make sense at all, another suggestion on where to get the > data? (I’m trying to avoid doing it using triggers). > > Thanks, > > Mike > Why not use rules? They are far more easy to use than patching at C Level, and you can simply write some functions at C level and load those as a postgres extension, and interact with MemCache at this level.
On May 29, 2008, at 1:57 AM, Hannu Krosing wrote: > On Wed, 2008-05-28 at 19:11 -0400, Mike wrote: > >> >> Can somebody point to the most logical place in the code to intercept >> the WAL writes? (just a rough direction would be enough)- or if this >> doesn’t make sense at all, another suggestion on where to get the >> data? > > I don't think that intercepting (and then decoding ) WAL is very > productive. It is too low level to be of much help. > > The way I'd do it would be using pgQ from SkyTools package where > change > events can be queued when happening and then moved in bulk to > memcached > with not too much effort. Actually, you might look one step further and see about adding memcached as a subscriber type to londiste; it might be easier than just using PgQ... not that using PgQ would be all that hard. Also, keep in mind that no matter what you do you'll always have a race condition between data in the database and in memcached. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828