Thread: Using incrond for archiving
Hey guys,

I've been running some tests while setting up some tiered storage, and I noticed something. Even having an empty 'echo' as archive_command drastically slows down certain operations. For instance:

=> ALTER TABLE foo SET TABLESPACE slow_tier;
ALTER TABLE
Time: 3969.962 ms

When I set archive_command to anything:

=> ALTER TABLE foo SET TABLESPACE slow_tier;
ALTER TABLE
Time: 11969.962 ms

I'm guessing it has something to do with the forking code, but I haven't dug into it very deeply yet.

I remembered seeing incrond as a way to grab file triggers, and did some tests with an incrontab of this:

/db/wal/ IN_CLOSE_WRITE cp -a $@/$# /db/archive/$#

Sure enough, files don't appear there until PG closes them after writing. The background writing also doesn't appear to affect the speed of my test command.

So my real question: is this safe? Supposedly the trigger only gets activated when the xlog file is closed, which only the PG daemon should be doing. I was only testing, so I didn't add a 'test -f' command to prevent overwriting existing archives, but I figured... why bother if there's no future there?

I'd say tripling the latency for some database writes is a pretty significant difference, though. I'll defer to the experts in case this is sketchy. :)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@peak6.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
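For reference, here is a minimal sketch of the 'test -f' guard mentioned above, written as a Python helper that incrond could call instead of a bare cp. The function name archive_segment and the way it would be wired into the incrontab are assumptions for illustration, not anything from the original tests. It says nothing about whether copying on IN_CLOSE_WRITE is safe in the first place.

```python
import os
import shutil


def archive_segment(src_path, archive_dir):
    """Copy src_path into archive_dir without overwriting an existing file.

    Returns True if a copy was made, False if the archive already had
    a file of that name.
    """
    dest_path = os.path.join(archive_dir, os.path.basename(src_path))
    try:
        # Mode "xb" creates the destination exclusively and raises
        # FileExistsError if it is already there, so an existing archive
        # is never overwritten (the same intent as a 'test -f' guard).
        with open(src_path, "rb") as src, open(dest_path, "xb") as dest:
            shutil.copyfileobj(src, dest)
    except FileExistsError:
        return False
    # Carry over timestamps and permission bits from the source file.
    shutil.copystat(src_path, dest_path)
    return True
```

Called from an incrontab entry, src_path would be the $@/$# pair and archive_dir would be /db/archive. The exclusive-create mode is used here rather than a separate existence check so that the check and the create happen in one step.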
On 11/11/2011 04:21 PM, Shaun Thomas wrote:

> So my real question: is this safe?

So to answer my own question: no, it's not safe. The PG backends apparently write to the xlog files periodically and *close* them after doing so. There's no open filehandle that gets written until the file is full and switched to the next one.

Knowing that, I used pg_xlogfile_name(pg_current_xlog_location()) to get the most recent xlog file, and wrote a small script incrond would launch. In the script, it gets the current xlog, and will refuse to copy that one.

What I don't quite understand is that after calling pg_switch_xlog(), it will sometimes still write to an old xlog several minutes later. Here's an example (0069 is the current xlog):

2011-11-14 15:05:01 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:05:06 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:05:20 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:06:01 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:06:06 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:06:06 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:06:37 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:06:58 : 0000000200000ED500000045 : xlog : copying
2011-11-14 15:07:01 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:07:06 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:07:08 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:07:20 : 0000000200000ED500000064 : xlog : copying
2011-11-14 15:07:24 : 0000000200000ED500000014 : xlog : copying
2011-11-14 15:07:39 : 0000000200000ED500000069 : xlog : too current
2011-11-14 15:07:45 : 0000000200000ED500000061 : xlog : copying
2011-11-14 15:08:01 : 0000000200000ED500000069 : xlog : too current

Why on earth is it sending IN_CLOSE_WRITE messages for 0014, 0045, and 0061? Is that just old threads closing their old filehandles after they realize they can't write to that particular xlog?

Either way, by adding a check for the most recent xlog to ignore, using either lsof or pg_current_xlog_location (ironically, the much faster of the two), I can "emulate" PG archive mode using an asynchronous background process.

On another note, watching kernel file IO messages is quite fascinating.

--
Shaun Thomas
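The check described in this message could look roughly like the sketch below. It is not the original script, which was not posted. The function name decide and the "ignored" result are assumptions for illustration, and the current segment name is passed in as an argument (in practice it would come from pg_xlogfile_name(pg_current_xlog_location())) so the sketch needs no database connection. The return strings mirror the "too current" and "copying" entries in the log above.

```python
import re

# Assumption: WAL segment files are named with exactly 24 hexadecimal
# digits. Anything else that shows up in the watched directory (for
# example .history or .backup files) is skipped by this sketch.
WAL_SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def decide(closed_file, current_segment):
    """Classify a file that just triggered IN_CLOSE_WRITE.

    Returns "ignored" for names that are not WAL segments,
    "too current" for the segment PG is still writing to,
    and "copying" for every other segment.
    """
    if not WAL_SEGMENT_NAME.match(closed_file):
        return "ignored"
    if closed_file == current_segment:
        return "too current"
    return "copying"
```

Note that this only filters out the current segment. It does not explain, or protect against, close events on older segments such as the ones logged above.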
Shaun Thomas <sthomas@peak6.com> wrote:

> Why on earth is it sending IN_CLOSE_WRITE messages for 0014, 0045,
> and 0061?

This sounds like it might be another manifestation of something that confused me a while back:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php
http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php

-Kevin
On 11/14/2011 03:47 PM, Kevin Grittner wrote:

> This sounds like it might be another manifestation of something that
> confused me a while back:
>
> http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php
> http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php

Interesting. That was probably the case for a couple of the older xlogs. I'm not sure what to think about the ones that were *not* deleted and were still being "closed" occasionally. I'm just going to chalk it up to connection turnover, since it seems normal for multiple connections to hold various transaction logs open, even if they're not writing to them.

I can handle the checks to pg_current_xlog_location so long as it's accurate. If pre-rotated transaction logs are still being written to, it's back to the drawing board for me. :)

--
Shaun Thomas