Thread: Forcing current WAL file to be archived
Patch included to implement xlog switching, using an xlog record "processing instruction" and forcibly moving xlog pointers. 1. Happens automatically on pg_stop_backup() 2. Can happen manually via pg_switch_xlog() 3. Implement range of utility functions: pg_current_wal_offset() pg_current_xlogfile() pg_current_xlogfile_offset() - for Hannu pg_xlogfile_from_wal_offset() for interpreting output from pg_switch_xlog, pg_start/stop_backup() Passes make check, applies cleanly to HEAD, includes doc patches with clean SGML builds. Design as clean as possible given and has implementation of archive_timeout in mind also. Happy to work further on any code cleanups requested. I've done a variety of testing on it, doing concurrent pg_regress and pg_switch_xlog(). All known issues resolved. Main test cases and sample outputs are in switchtest.sh Wide variety of cases need testing, so I'm expecting some further issues to be reported. I'm now working on completing the restartable recovery patch, which will include further tests of PITR recoveries on the xswitch.patch. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Attachment
Simon Riggs wrote: > Patch included to implement xlog switching, using an xlog record > "processing instruction" and forcibly moving xlog pointers. > > 1. Happens automatically on pg_stop_backup() Oh - so it will not be possible to do an online backup _without_ forcing a WAL switch any more? Laurenz Albe
Albe Laurenz wrote: > Simon Riggs wrote: > >>Patch included to implement xlog switching, using an xlog record >>"processing instruction" and forcibly moving xlog pointers. >> >>1. Happens automatically on pg_stop_backup() > > > Oh - so it will not be possible to do an online backup > _without_ forcing a WAL switch any more? Well, previously, you would have always had to simulate a wal switch, by working out which is the current wal file and copying that. Otherwise your online backup wouldn't be complete. What Simon is describing sounds like a big step forward from that situation. It should let me delete half the code in my pitr backup/failover scripts. Definitely a Good Thing. > Laurenz Albe Tim
Attachment
Tim Allen wrote: >>>Patch included to implement xlog switching, using an xlog record >>>"processing instruction" and forcibly moving xlog pointers. >>> >>>1. Happens automatically on pg_stop_backup() >> >> >> Oh - so it will not be possible to do an online backup >> _without_ forcing a WAL switch any more? > > Well, previously, you would have always had to simulate a wal > switch, by > working out which is the current wal file and copying that. Otherwise > your online backup wouldn't be complete. > > What Simon is describing sounds like a big step forward from that > situation. It should let me delete half the code in my pitr > backup/failover scripts. Definitely a Good Thing. Certainly a Good Thing, and it should be on by default. But couldn't there be situations where you'd like to do an online backup without a WAL switch? To avoid generating an archive WAL every day on a database with few changes, e.g.? Maybe not, I'm just wondering. Laurenz Albe
Albe Laurenz wrote: > Tim Allen wrote: >>>> Patch included to implement xlog switching, using an xlog record >>>> "processing instruction" and forcibly moving xlog pointers. >>>> >>>> 1. Happens automatically on pg_stop_backup() >>> >>> Oh - so it will not be possible to do an online backup >>> _without_ forcing a WAL switch any more? >> Well, previously, you would have always had to simulate a wal >> switch, by >> working out which is the current wal file and copying that. Otherwise >> your online backup wouldn't be complete. >> >> What Simon is describing sounds like a big step forward from that >> situation. It should let me delete half the code in my pitr >> backup/failover scripts. Definitely a Good Thing. > > Certainly a Good Thing, and it should be on by default. > > But couldn't there be situations where you'd like to do an > online backup without a WAL switch? To avoid generating an > archive WAL every day on a database with few changes, e.g.? But the online backup would be impossible to restore, if you don't have enough wal archived to recover past the point where you called pg_stop_backup(). So, doing a wal switch when pg_stop_backup() is called greatly reduces the risk of a user error that leads to broken backups. greetings, Florian Pflug
Albe Laurenz wrote: > Tim Allen wrote: > >>>Patch included to implement xlog switching, using an xlog record > >>>"processing instruction" and forcibly moving xlog pointers. > >>> > >>>1. Happens automatically on pg_stop_backup() > >> > >> > >> Oh - so it will not be possible to do an online backup > >> _without_ forcing a WAL switch any more? > > > > Well, previously, you would have always had to simulate a wal > > switch, by > > working out which is the current wal file and copying that. Otherwise > > your online backup wouldn't be complete. > > > > What Simon is describing sounds like a big step forward from that > > situation. It should let me delete half the code in my pitr > > backup/failover scripts. Definitely a Good Thing. > > Certainly a Good Thing, and it should be on by default. > > But couldn't there be situations where you'd like to do an > online backup without a WAL switch? To avoid generating an > archive WAL every day on a database with few changes, e.g.? > > Maybe not, I'm just wondering. Considering the I/O caused by the backup, a new WAL file seems insignificant, and until a log switch, the backup isn't useful. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Simon Riggs <simon@2ndquadrant.com> writes: > Patch included to implement xlog switching, using an xlog record > "processing instruction" and forcibly moving xlog pointers. Applied with revisions. I didn't like the extra state you added to track whether an xlog switch had occurred --- the more bits of interdependent state the more chance for bugs, IMHO, and it seemed unnecessary since it's easy enough to test whether we are at a segment boundary. I also made the new user-level functions a bit more orthogonal, so that filenames could be extracted from the existing functions like pg_stop_backup. regards, tom lane
On Sat, 2006-08-05 at 23:57 -0400, Tom Lane wrote: > I also made the new user-level functions a bit > more orthogonal, so that filenames could be extracted from the > existing functions like pg_stop_backup. Something Hannu wrote has just reminded me that pg_current_xlog_location() returns the current Insert pointer rather than the current Write pointer. That would not be useful for streaming xlog records would it? Methinks it should be the Write pointer all of the time, since I can't think of a valid reason for wanting to know where the Insert pointer is *before* we've written to the xlog file. Having it be the Insert pointer could lead to some errors. Any objections if I correct that? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Ühel kenal päeval, K, 2006-08-09 kell 12:56, kirjutas Simon Riggs: > On Sat, 2006-08-05 at 23:57 -0400, Tom Lane wrote: > > > I also made the new user-level functions a bit > > more orthogonal, so that filenames could be extracted from the > > existing functions like pg_stop_backup. > > Something Hannu wrote has just reminded me that > pg_current_xlog_location() returns the current Insert pointer rather > than the current Write pointer. > > That would not be useful for streaming xlog records would it? > > Methinks it should be the Write pointer all of the time, since I can't > think of a valid reason for wanting to know where the Insert pointer is > *before* we've written to the xlog file. Having it be the Insert pointer > could lead to some errors. > > Any objections if I correct that? What is the difference ? I'd expect it to point either to last byte written or to the next byte that will be written, and I want to know which one it is :) And another question: is is possible that under some circumstances the last few bytes of a WAL file will not be written to ? or is the writing done as if all the wal files together form one huge tape, without any gaps between ? -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com
Simon Riggs <simon@2ndquadrant.com> writes: > Something Hannu wrote has just reminded me that > pg_current_xlog_location() returns the current Insert pointer rather > than the current Write pointer. > That would not be useful for streaming xlog records would it? Good point. > Methinks it should be the Write pointer all of the time, since I can't > think of a valid reason for wanting to know where the Insert pointer is > *before* we've written to the xlog file. Having it be the Insert pointer > could lead to some errors. However the start/stop_backup functions return the Insert pointer. I can see scripts getting confused if pg_current_xlog_location reports something less than what they just got from pg_stop_backup. Is there value in exposing both pointers? (Maybe not, it'll just cause confusion probably.) Another option is to have pg_current_xlog_location force a write (but not fsync) as far as the Insert pointer it's about to return. This would eliminate any issues about inconsistency between results, but perhaps there's too much performance penalty. I'm not necessarily against your suggestion, just trying to be sure we've thought about all the options. regards, tom lane
Hannu Krosing <hannu@skype.net> writes: > Ühel kenal päeval, K, 2006-08-09 kell 12:56, kirjutas Simon Riggs: >> Methinks it should be the Write pointer all of the time, since I can't >> think of a valid reason for wanting to know where the Insert pointer is >> *before* we've written to the xlog file. Having it be the Insert pointer >> could lead to some errors. > What is the difference ? Insert points to the next byte to be written within the internal WAL buffers. The byte(s) preceding it haven't necessarily gotten out of those buffers yet. Write points to the end of what we've actually written to the kernel, and there's also a Flush pointer that points to the end of what we believe is down on disk. Simon's point is that if you're going to use pg_current_xlog_location() to control partial shipping of xlog files, you probably want to know about the Write location, because that indicates the limit of what is visible to an external process. regards, tom lane
On Wed, 2006-08-09 at 10:04 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Something Hannu wrote has just reminded me that > > pg_current_xlog_location() returns the current Insert pointer rather > > than the current Write pointer. > > That would not be useful for streaming xlog records would it? > > Good point. > > > Methinks it should be the Write pointer all of the time, since I can't > > think of a valid reason for wanting to know where the Insert pointer is > > *before* we've written to the xlog file. Having it be the Insert pointer > > could lead to some errors. > > However the start/stop_backup functions return the Insert pointer. > I can see scripts getting confused if pg_current_xlog_location reports > something less than what they just got from pg_stop_backup. > > Is there value in exposing both pointers? (Maybe not, it'll just cause > confusion probably.) > > Another option is to have pg_current_xlog_location force a write (but > not fsync) as far as the Insert pointer it's about to return. This > would eliminate any issues about inconsistency between results, but > perhaps there's too much performance penalty. > > I'm not necessarily against your suggestion, just trying to be sure > we've thought about all the options. Hannu's Use Case is to have the function called regularly from an external polling agent. If we don't do the write it could be possible in some circumstances for an XLogWrite to be delayed for some time if we have both large wal_buffers and period of few commits, whereas we might want to have the writes be fairly regular to support regular streaming. So I do see there is a reasonable case for performing a write. The way we have XLogWrite now, I would want that write to be a "flexible" write, which would leave us in the position that calling pg_current_xlog_location() would return something logically between the Insert pointer and the immediately prior Write pointer (even though very often there would be no difference at all). I'm inclined to say we should add a flexible write (i.e. XLogWrite(WriteRqst, true)) to pg_xlogfile_name_offset() and ignore the esoteric difference between the return value and the Insert pointer. I'm not clear how pg_xlogfile_name_offset() would ever return a different answer to pg_stop_backup() - surely we just forcibly moved the Insert and the Write pointer forwards together, so you'll get the same answer from each. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs <simon@2ndquadrant.com> writes: > On Wed, 2006-08-09 at 10:04 -0400, Tom Lane wrote: >> Another option is to have pg_current_xlog_location force a write (but >> not fsync) as far as the Insert pointer it's about to return. This >> would eliminate any issues about inconsistency between results, but >> perhaps there's too much performance penalty. > Hannu's Use Case is to have the function called regularly from an > external polling agent. If we don't do the write it could be possible in > some circumstances for an XLogWrite to be delayed for some time if we > have both large wal_buffers and period of few commits, whereas we might > want to have the writes be fairly regular to support regular streaming. > So I do see there is a reasonable case for performing a write. Now the other side of that coin is that any commit forces a write anyway. So any data that would hypothetically be written by pg_current_xlog_location would be uncommitted data, which is maybe not so important to write yet? Anyway, it's easy enough for a polling program to force a write if it wants to. > The way we have XLogWrite now, I would want that write to be a > "flexible" write, which would leave us in the position that calling > pg_current_xlog_location() would return something logically between the > Insert pointer and the immediately prior Write pointer (even though very > often there would be no difference at all). I really don't want that; it makes it impossible to define what the function is actually giving you. It's not an "esoteric difference". > I'm not clear how pg_xlogfile_name_offset() would ever return a > different answer to pg_stop_backup() - surely we just forcibly moved the > Insert and the Write pointer forwards together, so you'll get the same > answer from each. Hmm ... I guess given the just-added behavior of forcing an xlog switch, that's probably true now, but it wasn't before. Anyway, after further thought I've concluded that we really should supply something that returns the Insert pointer, as this would be useful for debugging and system-monitoring purposes. It's clear however that we also need something that returns the Write pointer, as that's what's needed for partial log-shipping. So my vote is for two functions, both read-only (and hence not superuser-only). Not sure what to name them exactly. regards, tom lane
On Thu, 2006-08-10 at 08:57 -0400, Tom Lane wrote: > Anyway, after further thought I've concluded that we really should > supply something that returns the Insert pointer, as this would be > useful for debugging and system-monitoring purposes. It's clear however > that we also need something that returns the Write pointer, as that's > what's needed for partial log-shipping. > So my vote is for two > functions, both read-only (and hence not superuser-only). Thats probably the most important consideration. > Not sure > what to name them exactly. pg_current_xlog_location() - gives the write pointer i.e. the offset up to which you can read() the xlog file and trust what it tells you pg_current_wal_insert_pointer() - gives the insert pointer :-) Named sufficiently differently that there is no confusion between them. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Aug 10, 2006, at 7:57 AM, Tom Lane wrote: > Anyway, after further thought I've concluded that we really should > supply something that returns the Insert pointer, as this would be > useful for debugging and system-monitoring purposes. It's clear > however > that we also need something that returns the Write pointer, as that's > what's needed for partial log-shipping. So my vote is for two > functions, both read-only (and hence not superuser-only). Not sure > what to name them exactly. Dumb question... is there any need to be able to get those values in sync (I'm assuming that in the time taken to call two separate functions the value on the second function called could change from what it was when the first function was called)? Should there be a SRF that returns both values? -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Ühel kenal päeval, K, 2006-08-09 kell 10:57, kirjutas Tom Lane: > Hannu Krosing <hannu@skype.net> writes: > > Ühel kenal päeval, K, 2006-08-09 kell 12:56, kirjutas Simon Riggs: > >> Methinks it should be the Write pointer all of the time, since I can't > >> think of a valid reason for wanting to know where the Insert pointer is > >> *before* we've written to the xlog file. Having it be the Insert pointer > >> could lead to some errors. > > > What is the difference ? > > Insert points to the next byte to be written within the internal WAL > buffers. The byte(s) preceding it haven't necessarily gotten out of > those buffers yet. Write points to the end of what we've actually > written to the kernel, I assume that it also points to the byte after what is written to kernel, or is it tha last byte written ? > and there's also a Flush pointer that points > to the end of what we believe is down on disk. > > Simon's point is that if you're going to use pg_current_xlog_location() > to control partial shipping of xlog files, you probably want to know > about the Write location, because that indicates the limit of what > is visible to an external process. Yes, that is what I need > regards, tom lane -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com
Hannu Krosing <hannu@skype.net> writes: > Ühel kenal päeval, K, 2006-08-09 kell 10:57, kirjutas Tom Lane: >> Insert points to the next byte to be written within the internal WAL >> buffers. The byte(s) preceding it haven't necessarily gotten out of >> those buffers yet. Write points to the end of what we've actually >> written to the kernel, > I assume that it also points to the byte after what is written to > kernel, or is it tha last byte written ? Right, it's really first-unwritten-byte for all three pointers. The two newly added functions to convert WAL locations to filenames use XLByteToPrevSeg(), so they should do the right thing here (see comments in src/include/access/xlog_internal.h). regards, tom lane
Ühel kenal päeval, L, 2006-08-12 kell 10:59, kirjutas Tom Lane: > Hannu Krosing <hannu@skype.net> writes: > > Ühel kenal päeval, K, 2006-08-09 kell 10:57, kirjutas Tom Lane: > >> Insert points to the next byte to be written within the internal WAL > >> buffers. The byte(s) preceding it haven't necessarily gotten out of > >> those buffers yet. Write points to the end of what we've actually > >> written to the kernel, > > > I assume that it also points to the byte after what is written to > > kernel, or is it tha last byte written ? > > Right, it's really first-unwritten-byte for all three pointers. > The two newly added functions to convert WAL locations to filenames > use XLByteToPrevSeg(), so they should do the right thing here > (see comments in src/include/access/xlog_internal.h). How do they behave exactly at the file boundary ? That is will it point 1 byte past end of old file, or byte 0 of the new one ? > regards, tom lane -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com
This issue is closed, right? --------------------------------------------------------------------------- Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Something Hannu wrote has just reminded me that > > pg_current_xlog_location() returns the current Insert pointer rather > > than the current Write pointer. > > That would not be useful for streaming xlog records would it? > > Good point. > > > Methinks it should be the Write pointer all of the time, since I can't > > think of a valid reason for wanting to know where the Insert pointer is > > *before* we've written to the xlog file. Having it be the Insert pointer > > could lead to some errors. > > However the start/stop_backup functions return the Insert pointer. > I can see scripts getting confused if pg_current_xlog_location reports > something less than what they just got from pg_stop_backup. > > Is there value in exposing both pointers? (Maybe not, it'll just cause > confusion probably.) > > Another option is to have pg_current_xlog_location force a write (but > not fsync) as far as the Insert pointer it's about to return. This > would eliminate any issues about inconsistency between results, but > perhaps there's too much performance penalty. > > I'm not necessarily against your suggestion, just trying to be sure > we've thought about all the options. > > regards, tom lane -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > This issue is closed, right? We've agreed we need two functions, but it's not done yet. Seems pretty trivial though ... regards, tom lane
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > This issue is closed, right? > > We've agreed we need two functions, but it's not done yet. Seems pretty > trivial though ... OK, that's what I was unclear about. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Sun, 2006-08-13 at 22:50 -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > This issue is closed, right? > > We've agreed we need two functions, but it's not done yet. Seems pretty > trivial though ... Just back from India. I'll work on this tonight. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Fri, 2006-08-11 at 08:04 +0100, Simon Riggs wrote: > On Thu, 2006-08-10 at 08:57 -0400, Tom Lane wrote: > > > Anyway, after further thought I've concluded that we really should > > supply something that returns the Insert pointer, as this would be > > useful for debugging and system-monitoring purposes. It's clear however > > that we also need something that returns the Write pointer, as that's > > what's needed for partial log-shipping. > > > So my vote is for two > > functions, both read-only (and hence not superuser-only). > > Thats probably the most important consideration. > > > Not sure > > what to name them exactly. > > pg_current_xlog_location() - gives the write pointer i.e. the offset up > to which you can read() the xlog file and trust what it tells you > > pg_current_wal_insert_pointer() - gives the insert pointer :-) > > Named sufficiently differently that there is no confusion between them. Patch implementing the above attached. Sample execution, with commentary at bottom. postgres=# select pg_current_wal_insert_pointer(), pg_current_xlog_location(); pg_current_wal_insert_pointer | pg_current_xlog_location -------------------------------+-------------------------- 0/3A0824 | 0/3A0824 (1 row) postgres=# begin;insert into blah values (1); BEGIN INSERT 0 1 postgres=# select pg_current_wal_insert_pointer(), pg_current_xlog_location(); pg_current_wal_insert_pointer | pg_current_xlog_location -------------------------------+-------------------------- 0/3A085C | 0/3A0824 (1 row) postgres=# insert into blah values (1); INSERT 0 1 postgres=# select pg_current_wal_insert_pointer(), pg_current_xlog_location(); pg_current_wal_insert_pointer | pg_current_xlog_location -------------------------------+-------------------------- 0/3A0894 | 0/3A0824 (1 row) postgres=# commit; COMMIT postgres=# select pg_current_wal_insert_pointer(), pg_current_xlog_location(); pg_current_wal_insert_pointer | pg_current_xlog_location -------------------------------+-------------------------- 0/3A08BC | 0/3A08BC (1 row) postgres=# select pg_switch_xlog(); pg_switch_xlog ---------------- 0/3A091C (1 row) postgres=# select pg_current_wal_insert_pointer(), pg_current_xlog_location(); pg_current_wal_insert_pointer | pg_current_xlog_location -------------------------------+-------------------------- 0/1000020 | 0/1000000 (1 row) postgres=# select pg_xlogfile_name_offset(pg_current_xlog_location()); pg_xlogfile_name_offset ----------------------------------- 000000010000000000000000 16777216 (1 row) The above shows that the Insert pointer is always ahead of or the same as the Write pointer. After a log switch the current location is shown as being in the next file, though the current filename still shows as the previous filename (since there has been no write activity yet on the new file) with an offset of 1 beyond EOF, to indicate that the whole file may now be read. pg_switch_xlog() shows the next-to-be written byte in the file that we have just switched out of, or the current location if we just performed a log switch. So the following sequence does *not* show there is an error in the returned pointer values. postgres=# insert into blah values (1); INSERT 0 1 postgres=# select pg_xlogfile_name_offset(pg_current_xlog_location()); pg_xlogfile_name_offset ------------------------------ 000000010000000000000001 372 (1 row) postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); pg_xlogfile_name_offset ------------------------------ 000000010000000000000001 400 (1 row) ...a log switch was performed postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); pg_xlogfile_name_offset ----------------------------------- 000000010000000000000001 16777216 (1 row) ...a log switch was *not* performed, since we're already at EOF I've not taken up Jim Nasby's suggestion to make this an SRF with multiple return rows/columns since that much complexity isn't justified IMHO. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Attachment
Simon Riggs wrote: > postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); > pg_xlogfile_name_offset > ----------------------------------- > 000000010000000000000001 16777216 > (1 row) > I've not taken up Jim Nasby's suggestion to make this an SRF with > multiple return rows/columns since that much complexity isn't justified > IMHO. Hum, but two columns here seem warranted, don't they? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, 2006-08-15 at 11:10 -0400, Alvaro Herrera wrote: > Simon Riggs wrote: > > > > > postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); > > pg_xlogfile_name_offset > > ----------------------------------- > > 000000010000000000000001 16777216 > > (1 row) > > > I've not taken up Jim Nasby's suggestion to make this an SRF with > > multiple return rows/columns since that much complexity isn't justified > > IMHO. > > Hum, but two columns here seem warranted, don't they? Maybe. People can write any function they like though, so I'm loathe to agonize over this too much. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Tue, Aug 15, 2006 at 06:07:12PM +0100, Simon Riggs wrote: > On Tue, 2006-08-15 at 11:10 -0400, Alvaro Herrera wrote: > > Simon Riggs wrote: > > > > > > > > > postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); > > > pg_xlogfile_name_offset > > > ----------------------------------- > > > 000000010000000000000001 16777216 > > > (1 row) > > > > > I've not taken up Jim Nasby's suggestion to make this an SRF with > > > multiple return rows/columns since that much complexity isn't justified > > > IMHO. > > > > Hum, but two columns here seem warranted, don't they? > > Maybe. People can write any function they like though, so I'm loathe to > agonize over this too much. True, but making people parse the output of a function to seperate the two fields seems pretty silly. Is there some reason why pg_xlogfile_name_offset shouldn't be a SRF, or use two out parameters? -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Tue, 2006-08-15 at 12:13 -0500, Jim C. Nasby wrote: > On Tue, Aug 15, 2006 at 06:07:12PM +0100, Simon Riggs wrote: > > On Tue, 2006-08-15 at 11:10 -0400, Alvaro Herrera wrote: > > > Simon Riggs wrote: > > > > > > > > > > > > > postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); > > > > pg_xlogfile_name_offset > > > > ----------------------------------- > > > > 000000010000000000000001 16777216 > > > > (1 row) > > > > > > > I've not taken up Jim Nasby's suggestion to make this an SRF with > > > > multiple return rows/columns since that much complexity isn't justified > > > > IMHO. > > > > > > Hum, but two columns here seem warranted, don't they? > > > > Maybe. People can write any function they like though, so I'm loathe to > > agonize over this too much. > > True, but making people parse the output of a function to seperate the > two fields seems pretty silly. Is there some reason why > pg_xlogfile_name_offset shouldn't be a SRF, or use two out parameters? If this makes a difference, then I'll do it. Does it make a difference? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Tue, Aug 15, 2006 at 07:11:24PM +0100, Simon Riggs wrote: > On Tue, 2006-08-15 at 12:13 -0500, Jim C. Nasby wrote: > > On Tue, Aug 15, 2006 at 06:07:12PM +0100, Simon Riggs wrote: > > > On Tue, 2006-08-15 at 11:10 -0400, Alvaro Herrera wrote: > > > > Simon Riggs wrote: > > > > > > > > > > > > > > > > > postgres=# select pg_xlogfile_name_offset(pg_switch_xlog()); > > > > > pg_xlogfile_name_offset > > > > > ----------------------------------- > > > > > 000000010000000000000001 16777216 > > > > > (1 row) > > > > > > > > > I've not taken up Jim Nasby's suggestion to make this an SRF with > > > > > multiple return rows/columns since that much complexity isn't justified > > > > > IMHO. > > > > > > > > Hum, but two columns here seem warranted, don't they? > > > > > > Maybe. People can write any function they like though, so I'm loathe to > > > agonize over this too much. > > > > True, but making people parse the output of a function to seperate the > > two fields seems pretty silly. Is there some reason why > > pg_xlogfile_name_offset shouldn't be a SRF, or use two out parameters? > > If this makes a difference, then I'll do it. Does it make a difference? Well, many languages make it easier to grab data from seperate fields than to parse out the contents of the field, and even on ones that don't it's not like it's hard to combine the two fields together like pg_xlogfile_name_offset() does right now. But more to the point, I can't see any use case for combining them together... if you want both pieces of info, you want them for different reasons, so cramming them together doesn't make any sense to me. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
"Jim C. Nasby" <jnasby@pervasive.com> writes: > True, but making people parse the output of a function to seperate the > two fields seems pretty silly. Is there some reason why > pg_xlogfile_name_offset shouldn't be a SRF, or use two out parameters? It'd definitely be nicer that way, but given the current limitations of bootstrap mode I see no non-kluge way to make a built-in function have OUT parameters. (Hint: array_in doesn't work in bootstrap mode.) And the other alternatives like a predefined complex type seem even more painful. If you can think of a way to do this that has pain not out of proportion to the gain, then I'm all for it ... regards, tom lane
I wrote: > It'd definitely be nicer that way, but given the current limitations of > bootstrap mode I see no non-kluge way to make a built-in function have > OUT parameters. (Hint: array_in doesn't work in bootstrap mode.) Actually, that turns out not to be so hard to fix as I thought. array_in only needs to work for the array types used in the core system tables, and bootstrap.c already has a hard-wired table of that info ... we only have to make it available to array_in. Which I just did. So let's fix pg_xlogfile_name_offset() to have two OUT parameters instead of returning a smushed-together string. The reason I knew about the array_in problem was I'd tried to make some other built-in function have OUT parameters ... I think it was probably one of the ones that we currently have underneath system views. It might be worthwhile converting some or all of these to use OUT parameters and not need the crutch of an AS clause in the view: pg_show_all_settings pg_lock_status pg_prepared_xact pg_stat_file pg_prepared_statement pg_cursor pg_timezonenames regards, tom lane
On Tue, 2006-08-15 at 18:42 -0400, Tom Lane wrote: > I wrote: > > It'd definitely be nicer that way, but given the current limitations of > > bootstrap mode I see no non-kluge way to make a built-in function have > > OUT parameters. (Hint: array_in doesn't work in bootstrap mode.) > > Actually, that turns out not to be so hard to fix as I thought. > array_in only needs to work for the array types used in the core system > tables, and bootstrap.c already has a hard-wired table of that info ... > we only have to make it available to array_in. Which I just did. Cool; I'd noticed that this would have been the first such function. > So let's fix pg_xlogfile_name_offset() to have two OUT parameters > instead of returning a smushed-together string. I'll do this, but I'm conscious that this is a cosmetic change. I'm going on vacation very soon now, so test reports of the major functionality would be greatly appreciated. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs <simon@2ndquadrant.com> writes: > On Tue, 2006-08-15 at 18:42 -0400, Tom Lane wrote: >> So let's fix pg_xlogfile_name_offset() to have two OUT parameters >> instead of returning a smushed-together string. > I'll do this, but I'm conscious that this is a cosmetic change. Well, it's cosmetic, but it's also an API change, which means that this is our only opportunity to get it right. Once these functions are in a release it will be too hard to change them. regards, tom lane
Simon Riggs <simon@2ndquadrant.com> writes: > We want a single row output, with two columns, yes? > Presumably: > xlogfilename TEXT > offset INTEGER Sounds right to me. int4 should be wide enough for practical xlog segment sizes. regards, tom lane
On Wed, 2006-08-16 at 08:51 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > On Tue, 2006-08-15 at 18:42 -0400, Tom Lane wrote: > >> So let's fix pg_xlogfile_name_offset() to have two OUT parameters > >> instead of returning a smushed-together string. > > > I'll do this, but I'm conscious that this is a cosmetic change. > > Well, it's cosmetic, but it's also an API change, which means that this > is our only opportunity to get it right. Once these functions are in a > release it will be too hard to change them. I've just started working this part, now I have the rest complete. We want a single row output, with two columns, yes? Presumably: xlogfilename TEXT offset INTEGER -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Wed, 2006-08-16 at 11:45 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > We want a single row output, with two columns, yes? > > Presumably: > > xlogfilename TEXT > > offset INTEGER > > Sounds right to me. int4 should be wide enough for practical xlog > segment sizes. Wise one: what should my pg_proc look like? I'm the lucky man to break the "_null_ _null_ _null_" rule... I've tried DATA(insert OID = 2850 ( pg_xlogfile_name_offset PGNSP PGUID 12 f f t f i 1 2249 "25" "25 25 23" "i o o" _null_ pg_xlogfile_name_offset - _null_ )); but my initdb fails with selecting default shared_buffers/max_fsm_pages ... 20000kB/1000000 creating configuration files ... ok creating template1 database in a/base/1 ... FATAL: cache lookup failed for type 26 child process exited with exit code 1 initdb: removing data directory "a" Thinking this might be an 0-referenced array issue, I also tried "24 24 22" in the above, but that bombs with the same error. Currently, if I just leave it as it is, then initdb runs but then hangs/bombs when you invokle the function (as you might expect). As far as I can tell, the function isn't ever called correctly without this... copied here for info. /* * Compute an xlog file name and decimal byte offset given a WAL location, * such as is returned by pg_stop_backup() or pg_xlog_switch(). * * Note that a location exactly at a segment boundary is taken to be in * the previous segment. This is usually the right thing, since the * expected usage is to determine which xlog file(s) are ready to archive. */ Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS) { text *location = PG_GETARG_TEXT_P(0); char *locationstr; unsigned int uxlogid; unsigned int uxrecoff; uint32 xlogid; uint32 xlogseg; uint32 xrecoff; XLogRecPtr locationpoint; char xlogfilename[MAXFNAMELEN]; TupleDesc returnTupleDesc; Datum values[2]; bool isnull[2]; HeapTuple returnHeapTuple; Datum result; /* * Read input and parse */ locationstr = DatumGetCString(DirectFunctionCall1(textout, PointerGetDatum(location))); if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("could not parse xlog location \"%s\"", locationstr))); locationpoint.xlogid = uxlogid; locationpoint.xrecoff = uxrecoff; /* Construct a tuple descriptor for the result rows. */ returnTupleDesc = CreateTemplateTupleDesc(2, false); TupleDescInitEntry(returnTupleDesc, (AttrNumber) 1, "xlogfilename", TEXTOID, -1, 0); TupleDescInitEntry(returnTupleDesc, (AttrNumber) 2, "offset", INT4OID, -1, 0); returnTupleDesc = BlessTupleDesc(returnTupleDesc); /* * xlogfilename */ XLByteToPrevSeg(locationpoint, xlogid, xlogseg); XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg); values[0] = PointerGetDatum(xlogfilename); isnull[0] = false; /* * offset */ xrecoff = locationpoint.xrecoff - xlogseg * XLogSegSize; values[1] = UInt32GetDatum(xrecoff); isnull[1] = false; /* * Tuple jam: Having first prepared your Datums, then squash together */ returnHeapTuple = heap_form_tuple(returnTupleDesc, values, isnull); result = HeapTupleGetDatum(returnHeapTuple); PG_RETURN_DATUM(result); } -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs <simon@2ndquadrant.com> writes: > but my initdb fails with > creating template1 database in a/base/1 ... FATAL: cache lookup failed > for type 26 Um ... when did you last "cvs update"? That was the behavior up till I fixed array_in for bootstrap mode, yesterday afternoon ... regards, tom lane
On Wed, 2006-08-16 at 16:51 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > but my initdb fails with > > > creating template1 database in a/base/1 ... FATAL: cache lookup failed > > for type 26 > > Um ... when did you last "cvs update"? That was the behavior up till I > fixed array_in for bootstrap mode, yesterday afternoon ... Sounds like it must be so. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs <simon@2ndquadrant.com> writes: > Wise one: what should my pg_proc look like? > DATA(insert OID = 2850 ( pg_xlogfile_name_offset PGNSP PGUID 12 f f t f > i 1 2249 "25" "25 25 23" "i o o" _null_ pg_xlogfile_name_offset - > _null_ )); Oh, as far as that goes, the array columns need to look like something array_in will eat; and you should provide parameter names so that "select * from" will produce useful headings. So probably more like DATA(insert OID = 2850 ( pg_xlogfile_name_offset PGNSP PGUID 12 f f t f i 1 2249 "25" "{25,25,23}" "{i,o,o}" "{wal_offset,filename,offset}"pg_xlogfile_name_offset - _null_ )); I think you can get away without inner quotes (ie, not "{'i','o','o'}") as long as you aren't using anything weird like spaces in a parameter name. regards, tom lane
On Wed, 2006-08-16 at 17:09 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Wise one: what should my pg_proc look like? > > > DATA(insert OID = 2850 ( pg_xlogfile_name_offset PGNSP PGUID 12 f f t f > > i 1 2249 "25" "25 25 23" "i o o" _null_ pg_xlogfile_name_offset - > > _null_ )); > > Oh, as far as that goes, the array columns need to look like something > array_in will eat; and you should provide parameter names so that > "select * from" will produce useful headings. So probably more like > > DATA(insert OID = 2850 ( pg_xlogfile_name_offset PGNSP PGUID 12 f f t f i 1 2249 "25" "{25,25,23}" "{i,o,o}" "{wal_offset,filename,offset}"pg_xlogfile_name_offset - _null_ )); > > I think you can get away without inner quotes (ie, not "{'i','o','o'}") > as long as you aren't using anything weird like spaces in a parameter name. archive_timeout++.patch re-submitted on other thread, now including these changes also. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com