Thread: intercepting WAL writes

intercepting WAL writes

From
"Mike"
Date:
<div class="Section1"><p class="MsoNormal"><span style="font-size:10.0pt">Hello,</span><p class="MsoNormal"><span
style="font-size:10.0pt"> </span><pclass="MsoNormal"><span style="font-size:10.0pt">I’m new to the core PostgreSQL
code,so pardon the question if the answer is really obvious, and I’m just missing it, but I’ve got a relatively large
webapplication that uses PostgreSQL as a back-end database, and we’re heavily using memcached to cache frequently
accesseddata.</span><p class="MsoNormal"><span style="font-size:10.0pt"> </span><p class="MsoNormal"><span
style="font-size:10.0pt">I’mlooking at modifying PostgreSQL (in some way) to push changes directly to our memcache
servers,in hopes of moving towards a system where only writes are actually sent to the databases, and reads are
exclusivelysent to the memcache servers.</span><p class="MsoNormal"><span style="font-size:10.0pt"> </span><p
class="MsoNormal"><spanstyle="font-size:10.0pt">I’m guessing that I could intercept the WAL writes, and use this
informationto push out to my memcache servers, similar to a replication model.</span><p class="MsoNormal"><span
style="font-size:10.0pt"> </span><pclass="MsoNormal"><span style="font-size:10.0pt">Can somebody point to the most
logicalplace in the code to intercept the WAL writes? (just a rough direction would be enough)- or if this doesn’t make
senseat all, another suggestion on where to get the data? (I’m trying to avoid doing it using triggers).</span><p
class="MsoNormal"><spanstyle="font-size:10.0pt"> </span><p class="MsoNormal"><span
style="font-size:10.0pt">Thanks,</span><pclass="MsoNormal"><span style="font-size:10.0pt"> </span><p
class="MsoNormal"><spanstyle="font-size:10.0pt">Mike</span></div> 

Re: intercepting WAL writes

From
"Jonah H. Harris"
Date:
On Wed, May 28, 2008 at 7:11 PM, Mike <mike@fonolo.com> wrote:
> Can somebody point to the most logical place in the code to intercept the
> WAL writes? (just a rough direction would be enough)

XLogInsert

> or if this doesn't make sense at all, another suggestion on where to get
> the data? (I'm trying to avoid doing it using triggers).

Without triggers, you don't have many options.  With triggers, you
could use pg_memcache.  If you take it from the WAL, you'll have to do
a bit of decoding to make it usable in the context you're looking for,
which is quite a bit of work.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/


Re: intercepting WAL writes

From
Jeff Davis
Date:
On Wed, 2008-05-28 at 19:11 -0400, Mike wrote:
> Can somebody point to the most logical place in the code to intercept
> the WAL writes? (just a rough direction would be enough)- or if this
> doesn’t make sense at all, another suggestion on where to get the
> data? (I’m trying to avoid doing it using triggers).

Why are you avoiding triggers? One solution might be to use Slony to
just create the log, and then read the log of events into memcached
rather than another PostgreSQL instance.

http://slony.info/documentation/logshipping.html

Those logs might be easier to process than the WAL.

Also, why do you need to intercept the WAL writes, and not just read
from a WAL archive? Does this need to be synchronous?

Regards,Jeff Davis



Re: intercepting WAL writes

From
"Jonah H. Harris"
Date:
On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote:
> When you say a bit of decoding, is that because the data written to the logs
> is after the query parser/planner? Or because it's written in several
> chunks? Or?

Because that's the actual recovery record.  There is no SQL text, just
the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/


Re: intercepting WAL writes

From
"Mike"
Date:
On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote:
>> When you say a bit of decoding, is that because the data written to the
logs
>> is after the query parser/planner? Or because it's written in several
>> chunks? Or?
>
>Because that's the actual recovery record.  There is no SQL text, just
>the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
>XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.

Oh- right- that makes sense.

I installed and started looking at the source code for xlogviewer and
xlogdump; seems like a reasonable place to start.

Thanks for your help,

Mike



Re: intercepting WAL writes

From
"Koichi Suzuki"
Date:
And you will have a chance to encounter full page writes, whole page
image, which could be produced during the hot backup and the first
modification to the data page after a checkpoint (if you turn full
page write option "on" by GUC).

2008/5/29 Mike <mike@fonolo.com>:
> On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote:
>>> When you say a bit of decoding, is that because the data written to the
> logs
>>> is after the query parser/planner? Or because it's written in several
>>> chunks? Or?
>>
>>Because that's the actual recovery record.  There is no SQL text, just
>>the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
>>XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.
>
> Oh- right- that makes sense.
>
> I installed and started looking at the source code for xlogviewer and
> xlogdump; seems like a reasonable place to start.
>
> Thanks for your help,
>
> Mike
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
------
Koichi Suzuki


Re: intercepting WAL writes

From
"Mike"
Date:
On Wed, May 28, 2008 at 7:11 PM, Mike <mike@fonolo.com> wrote:
>> Can somebody point to the most logical place in the code to intercept the
>> WAL writes? (just a rough direction would be enough)
>
>XLogInsert
>

Great- I'll take a look at that code.

>> or if this doesn't make sense at all, another suggestion on where to get
>> the data? (I'm trying to avoid doing it using triggers).
>
>Without triggers, you don't have many options.  With triggers, you
>could use pg_memcache.  If you take it from the WAL, you'll have to do
>a bit of decoding to make it usable in the context you're looking for,
>which is quite a bit of work.
>

When you say a bit of decoding, is that because the data written to the logs
is after the query parser/planner? Or because it's written in several
chunks? Or?

I turned on WAL_DEBUG; it wasn't immediately helpful, tho I don't know what
it means quite yet. I guess perfectly formatted SQL statements would be too
easy ;)

Mike



Re: intercepting WAL writes

From
Hannu Krosing
Date:
On Wed, 2008-05-28 at 19:11 -0400, Mike wrote:

> 
> Can somebody point to the most logical place in the code to intercept
> the WAL writes? (just a rough direction would be enough)- or if this
> doesn’t make sense at all, another suggestion on where to get the
> data? 

I don't think that intercepting (and then decoding ) WAL is very
productive. It is too low level to be of much help.

The way I'd do it would be using pgQ from SkyTools package where change
events can be queued when happening and then moved in bulk to memcached
with not too much effort.

Marko Kreen, the primary author of pgQ did a presentation on it at this
years pgcon :

http://www.pgcon.org/2008/schedule/events/79.en.html

You can download the presentation slides to get the overview.


> (I’m trying to avoid doing it using triggers).

I can't see a good point in avoiding triggers, if you end up doing a lot
more work to reconstruct the data which would have been easy to get
using a trigger.

I've heard that there was and add-on replication solution for Oracle
which extracted data from WAL and I was told that it maxed out at about
200 tps. pgQ approach can do much better.

---------------
Hannu




Re: intercepting WAL writes

From
"Mike"
Date:
>On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote:
>> When you say a bit of decoding, is that because the data written to the
logs
>> is after the query parser/planner? Or because it's written in several
>> chunks? Or?
>
>Because that's the actual recovery record.  There is no SQL text, just
>the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
>XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.

Hmm... after looking at the xlogdump and xlogview applications, I can see
the difficulty pulling this information you were talking about.

I see that I can get the OID's for the relations, and then lookup their
names from the source database; but pulling where clause and field info
seems to be a bit tougher.

Is there another place in the code, I can get access to the statements (or
statement "like" information), after a transaction commit?

Thanks for all your help,

Mike



Re: intercepting WAL writes

From
Tom Lane
Date:
"Mike" <mike@fonolo.com> writes:
> Is there another place in the code, I can get access to the statements (or
> statement "like" information), after a transaction commit?

No.

Bear in mind that what you have decided to do amounts to rolling your
own replication system.  This is a Hard Problem.  I would suggest
thinking in terms of adapting one of the existing replication projects
rather than doing it from scratch.
        regards, tom lane


Re: intercepting WAL writes

From
Simon Riggs
Date:
On Wed, 2008-05-28 at 21:47 -0400, Mike wrote:
> On Wed, May 28, 2008 at 8:30 PM, Mike <mike@fonolo.com> wrote:
> >> When you say a bit of decoding, is that because the data written to the
> logs
> >> is after the query parser/planner? Or because it's written in several
> >> chunks? Or?
> >
> >Because that's the actual recovery record.  There is no SQL text, just
> >the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
> >XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.
> 
> Oh- right- that makes sense.

Hmm, yes and no.

WAL doesn't carry enough information to reconstruct updates or deletes
external to the database in question. That's the barrier you need to
cross, not just piping the results somewhere.

I envision being able to do this in the future and exposing an API to
allow it to happen, but we aren't there yet.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: intercepting WAL writes

From
Mario Weilguni
Date:
Mike schrieb:
>
> Hello,
>
> I’m new to the core PostgreSQL code, so pardon the question if the 
> answer is really obvious, and I’m just missing it, but I’ve got a 
> relatively large web application that uses PostgreSQL as a back-end 
> database, and we’re heavily using memcached to cache frequently 
> accessed data.
>
> I’m looking at modifying PostgreSQL (in some way) to push changes 
> directly to our memcache servers, in hopes of moving towards a system 
> where only writes are actually sent to the databases, and reads are 
> exclusively sent to the memcache servers.
>
> I’m guessing that I could intercept the WAL writes, and use this 
> information to push out to my memcache servers, similar to a 
> replication model.
>
> Can somebody point to the most logical place in the code to intercept 
> the WAL writes? (just a rough direction would be enough)- or if this 
> doesn’t make sense at all, another suggestion on where to get the 
> data? (I’m trying to avoid doing it using triggers).
>
> Thanks,
>
> Mike
>
Why not use rules? They are far more easy to use than patching at C 
Level, and you can simply write some functions at C level and load those 
as a postgres extension, and interact with MemCache at this level.


Re: intercepting WAL writes

From
Decibel!
Date:
On May 29, 2008, at 1:57 AM, Hannu Krosing wrote:
> On Wed, 2008-05-28 at 19:11 -0400, Mike wrote:
>
>>
>> Can somebody point to the most logical place in the code to intercept
>> the WAL writes? (just a rough direction would be enough)- or if this
>> doesn’t make sense at all, another suggestion on where to get the
>> data?
>
> I don't think that intercepting (and then decoding ) WAL is very
> productive. It is too low level to be of much help.
>
> The way I'd do it would be using pgQ from SkyTools package where
> change
> events can be queued when happening and then moved in bulk to
> memcached
> with not too much effort.


Actually, you might look one step further and see about adding
memcached as a subscriber type to londiste; it might be easier than
just using PgQ... not that using PgQ would be all that hard.

Also, keep in mind that no matter what you do you'll always have a
race condition between data in the database and in memcached.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828