Thread: xlog location arithmetic
Hi, A while ago when blogging about WAL [1], I noticed a function to deal with xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and after some questions during trainings and conferences I decided to translate my shell script function in C. The attached patch implements the function pg_xlog_location_diff (bikeshed colors are welcome). It calculates the difference between two given transaction log locations. Now that we have pg_stat_replication view, it will be easy to get the lag just passing columns as parameters. Also, the monitoring tools could take advantage of it instead of relying on a fragile routine to get the lag. I noticed that pg_xlogfile_name* functions does not sanity check the xrecoff boundaries but that is material for another patch. [1] http://eulerto.blogspot.com/2011/11/understanding-wal-nomenclature.html [2] http://www.depesz.com/index.php/2011/01/24/waiting-for-9-1-pg_stat_replication/ -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachment
On Tue, Dec 6, 2011 at 05:19, Euler Taveira de Oliveira <euler@timbira.com> wrote: > Hi, > > A while ago when blogging about WAL [1], I noticed a function to deal with > xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and > after some questions during trainings and conferences I decided to translate > my shell script function in C. > > The attached patch implements the function pg_xlog_location_diff (bikeshed > colors are welcome). It calculates the difference between two given > transaction log locations. Now that we have pg_stat_replication view, it will > be easy to get the lag just passing columns as parameters. Also, the > monitoring tools could take advantage of it instead of relying on a fragile > routine to get the lag. I've been considering similar things, as you can find in the archives, but what I was thinking of was converting the number to just a plain bigint, then letting the user apply whatever arithmetic wanted at the SQL level. I never got around to acutally coding it, though. It could easily be extracted from your patch of course - and I think that's a more flexible approach. Is there some advantage to your method that I'm missing? Also, why do you use DirectFunctionCall to do the simple math, and not just do the math right there in the function? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: > I've been considering similar things, as you can find in the archives, > but what I was thinking of was converting the number to just a plain > bigint, then letting the user apply whatever arithmetic wanted at the > SQL level. I never got around to acutally coding it, though. It could > easily be extracted from your patch of course - and I think that's a > more flexible approach. Is there some advantage to your method that > I'm missing? I went so far as to put together an lsn data type. I didn't actually get all that far with it, which is why I haven't posted it sooner, but here's what I came up with. It's missing indexing support and stuff, but that could be added if people like the approach. It solves this problem by implementing -(lsn,lsn) => numeric (not int8, that can overflow since it is not unsigned), which allows an lsn => numeric conversion by just subtracting '0/0'::lsn. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
On 06-12-2011 07:14, Magnus Hagander wrote: > On Tue, Dec 6, 2011 at 05:19, Euler Taveira de Oliveira > <euler@timbira.com> wrote: >> Hi, >> >> A while ago when blogging about WAL [1], I noticed a function to deal with >> xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and >> after some questions during trainings and conferences I decided to translate >> my shell script function in C. >> >> The attached patch implements the function pg_xlog_location_diff (bikeshed >> colors are welcome). It calculates the difference between two given >> transaction log locations. Now that we have pg_stat_replication view, it will >> be easy to get the lag just passing columns as parameters. Also, the >> monitoring tools could take advantage of it instead of relying on a fragile >> routine to get the lag. > > I've been considering similar things, as you can find in the archives, > but what I was thinking of was converting the number to just a plain > bigint, then letting the user apply whatever arithmetic wanted at the > SQL level. I never got around to acutally coding it, though. It could > easily be extracted from your patch of course - and I think that's a > more flexible approach. Is there some advantage to your method that > I'm missing? > The only advantage is that you don't expose the arithmetic, e.g., user doesn't need to know the xlog internals (like I described in a recent blog post). If one day we consider changes in xlog arithmetic (for example, XLogFileSize), we don't need to worry too much about external tools. > Also, why do you use DirectFunctionCall to do the simple math, and not > just do the math right there in the function? > I use it because I don't want to duplicate the overflow code. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte24x7 e Treinamento
On 06-12-2011 13:11, Robert Haas wrote: > On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: >> I've been considering similar things, as you can find in the archives, >> but what I was thinking of was converting the number to just a plain >> bigint, then letting the user apply whatever arithmetic wanted at the >> SQL level. I never got around to acutally coding it, though. It could >> easily be extracted from your patch of course - and I think that's a >> more flexible approach. Is there some advantage to your method that >> I'm missing? > > I went so far as to put together an lsn data type. I didn't actually > get all that far with it, which is why I haven't posted it sooner, but > here's what I came up with. It's missing indexing support and stuff, > but that could be added if people like the approach. It solves this > problem by implementing -(lsn,lsn) => numeric (not int8, that can > overflow since it is not unsigned), which allows an lsn => numeric > conversion by just subtracting '0/0'::lsn. > Interesting approach. I don't want to go that far. If so, you want to change all of those functions that deal with LSNs and add some implicit conversion between text and lsn data types (for backward compatibility). As of int8, I'm not aware of any modern plataform that int8 is not 64 bits. I'm not against numeric use; I'm just saying that int8 is sufficient. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte24x7 e Treinamento
On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 06-12-2011 13:11, Robert Haas wrote: >> On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: >>> I've been considering similar things, as you can find in the archives, >>> but what I was thinking of was converting the number to just a plain >>> bigint, then letting the user apply whatever arithmetic wanted at the >>> SQL level. I never got around to acutally coding it, though. It could >>> easily be extracted from your patch of course - and I think that's a >>> more flexible approach. Is there some advantage to your method that >>> I'm missing? >> >> I went so far as to put together an lsn data type. I didn't actually >> get all that far with it, which is why I haven't posted it sooner, but >> here's what I came up with. It's missing indexing support and stuff, >> but that could be added if people like the approach. It solves this >> problem by implementing -(lsn,lsn) => numeric (not int8, that can >> overflow since it is not unsigned), which allows an lsn => numeric >> conversion by just subtracting '0/0'::lsn. >> > Interesting approach. I don't want to go that far. If so, you want to change > all of those functions that deal with LSNs and add some implicit conversion > between text and lsn data types (for backward compatibility). As of int8, I'm > not aware of any modern plataform that int8 is not 64 bits. I'm not against > numeric use; I'm just saying that int8 is sufficient. The point isn't that int8 might not be 64 bits - of course it has to be 64 bits; that's why it's called int8 i.e. 8 bytes. The point is that a large enough LSN, represented as an int8, will come out as a negative values. int8 can only represent 2^63 *non-negative* values, because one bit is reserved for sign. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Dec 6, 2011, at 12:06 PM, Robert Haas wrote: > On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira > <euler@timbira.com> wrote: >> On 06-12-2011 13:11, Robert Haas wrote: >>> On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>> I've been considering similar things, as you can find in the archives, >>>> but what I was thinking of was converting the number to just a plain >>>> bigint, then letting the user apply whatever arithmetic wanted at the >>>> SQL level. I never got around to acutally coding it, though. It could >>>> easily be extracted from your patch of course - and I think that's a >>>> more flexible approach. Is there some advantage to your method that >>>> I'm missing? >>> >>> I went so far as to put together an lsn data type. I didn't actually >>> get all that far with it, which is why I haven't posted it sooner, but >>> here's what I came up with. It's missing indexing support and stuff, >>> but that could be added if people like the approach. It solves this >>> problem by implementing -(lsn,lsn) => numeric (not int8, that can >>> overflow since it is not unsigned), which allows an lsn => numeric >>> conversion by just subtracting '0/0'::lsn. >>> >> Interesting approach. I don't want to go that far. If so, you want to change >> all of those functions that deal with LSNs and add some implicit conversion >> between text and lsn data types (for backward compatibility). As of int8, I'm >> not aware of any modern plataform that int8 is not 64 bits. I'm not against >> numeric use; I'm just saying that int8 is sufficient. > > The point isn't that int8 might not be 64 bits - of course it has to > be 64 bits; that's why it's called int8 i.e. 8 bytes. The point is > that a large enough LSN, represented as an int8, will come out as a > negative values. int8 can only represent 2^63 *non-negative* values, > because one bit is reserved for sign. I've often wondered about adding uint2/4/8... I suspect it's actually pretty uncommon for people to put negative numbersinto int fields, since one of their biggest uses seems to be surrogate keys. I realize that this opens a can of worms with casting, but perhaps that can be kept under control by not doing any implicitcasting between int and uint... that just means that we'd have to be smart about casting from unknown, but hopefullythat's doable since we already have a similar concern with casting unknown to int2/4/8 vs numeric? -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Tue, Dec 13, 2011 at 12:48 PM, Jim Nasby <jim@nasby.net> wrote: > I've often wondered about adding uint2/4/8... I suspect it's actually pretty uncommon for people to put negative numbersinto int fields, since one of their biggest uses seems to be surrogate keys. > > I realize that this opens a can of worms with casting, but perhaps that can be kept under control by not doing any implicitcasting between int and uint... that just means that we'd have to be smart about casting from unknown, but hopefullythat's doable since we already have a similar concern with casting unknown to int2/4/8 vs numeric? I've wondered about it too, but it seems like too large a can of worms to open just to address this case. Returning the value as numeric is hardly a disaster; the user can always downcast to int8 if they really want, and as long as it's < 2^63 (which in practice it virtually always will be) it will work. It's not clear what the point of this is since for typical values numeric is going to take up less storage anyway (e.g. 1000001 is 7 bytes on disk as a numeric), not to mention that it only requires 4-byte alignment rather than 8-byte alignment, and probably no one does enough arithmetic with LSN values for any speed penalty to matter even slightly, but it should work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Dec 6, 2011 at 19:06, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira > <euler@timbira.com> wrote: >> On 06-12-2011 13:11, Robert Haas wrote: >>> On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>> I've been considering similar things, as you can find in the archives, >>>> but what I was thinking of was converting the number to just a plain >>>> bigint, then letting the user apply whatever arithmetic wanted at the >>>> SQL level. I never got around to acutally coding it, though. It could >>>> easily be extracted from your patch of course - and I think that's a >>>> more flexible approach. Is there some advantage to your method that >>>> I'm missing? >>> >>> I went so far as to put together an lsn data type. I didn't actually >>> get all that far with it, which is why I haven't posted it sooner, but >>> here's what I came up with. It's missing indexing support and stuff, >>> but that could be added if people like the approach. It solves this >>> problem by implementing -(lsn,lsn) => numeric (not int8, that can >>> overflow since it is not unsigned), which allows an lsn => numeric >>> conversion by just subtracting '0/0'::lsn. >>> >> Interesting approach. I don't want to go that far. If so, you want to change >> all of those functions that deal with LSNs and add some implicit conversion >> between text and lsn data types (for backward compatibility). As of int8, I'm As long as you have the conversion, you don't really need to change them, do you? It might be nice in some ways, but this is still a pretty internal operation, so I don't see it as critical. >> not aware of any modern plataform that int8 is not 64 bits. I'm not against >> numeric use; I'm just saying that int8 is sufficient. > > The point isn't that int8 might not be 64 bits - of course it has to > be 64 bits; that's why it's called int8 i.e. 8 bytes. The point is > that a large enough LSN, represented as an int8, will come out as a > negative values. int8 can only represent 2^63 *non-negative* values, > because one bit is reserved for sign. Doing it in numeric should be perfectly fine. The only real reason to pick int8 over in this context would be performance, but it's not like this is something that's going to be called in really performance critical paths... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 20-12-2011 07:27, Magnus Hagander wrote: > On Tue, Dec 6, 2011 at 19:06, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira >> <euler@timbira.com> wrote: >>> On 06-12-2011 13:11, Robert Haas wrote: >>>> On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>>> I've been considering similar things, as you can find in the archives, >>>>> but what I was thinking of was converting the number to just a plain >>>>> bigint, then letting the user apply whatever arithmetic wanted at the >>>>> SQL level. I never got around to acutally coding it, though. It could >>>>> easily be extracted from your patch of course - and I think that's a >>>>> more flexible approach. Is there some advantage to your method that >>>>> I'm missing? >>>> >>>> I went so far as to put together an lsn data type. I didn't actually >>>> get all that far with it, which is why I haven't posted it sooner, but >>>> here's what I came up with. It's missing indexing support and stuff, >>>> but that could be added if people like the approach. It solves this >>>> problem by implementing -(lsn,lsn) => numeric (not int8, that can >>>> overflow since it is not unsigned), which allows an lsn => numeric >>>> conversion by just subtracting '0/0'::lsn. >>>> >>> Interesting approach. I don't want to go that far. If so, you want to change >>> all of those functions that deal with LSNs and add some implicit conversion >>> between text and lsn data types (for backward compatibility). As of int8, I'm > > As long as you have the conversion, you don't really need to change > them, do you? It might be nice in some ways, but this is still a > pretty internal operation, so I don't see it as critical. > For correctness, yes. At this point, my question is: do we want to support the lsn data type idea or a basic function that implements the difference between LSNs? -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte24x7 e Treinamento
On Tue, Dec 20, 2011 at 14:08, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 20-12-2011 07:27, Magnus Hagander wrote: >> On Tue, Dec 6, 2011 at 19:06, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira >>> <euler@timbira.com> wrote: >>>> On 06-12-2011 13:11, Robert Haas wrote: >>>>> On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>>>> I've been considering similar things, as you can find in the archives, >>>>>> but what I was thinking of was converting the number to just a plain >>>>>> bigint, then letting the user apply whatever arithmetic wanted at the >>>>>> SQL level. I never got around to acutally coding it, though. It could >>>>>> easily be extracted from your patch of course - and I think that's a >>>>>> more flexible approach. Is there some advantage to your method that >>>>>> I'm missing? >>>>> >>>>> I went so far as to put together an lsn data type. I didn't actually >>>>> get all that far with it, which is why I haven't posted it sooner, but >>>>> here's what I came up with. It's missing indexing support and stuff, >>>>> but that could be added if people like the approach. It solves this >>>>> problem by implementing -(lsn,lsn) => numeric (not int8, that can >>>>> overflow since it is not unsigned), which allows an lsn => numeric >>>>> conversion by just subtracting '0/0'::lsn. >>>>> >>>> Interesting approach. I don't want to go that far. If so, you want to change >>>> all of those functions that deal with LSNs and add some implicit conversion >>>> between text and lsn data types (for backward compatibility). As of int8, I'm >> >> As long as you have the conversion, you don't really need to change >> them, do you? It might be nice in some ways, but this is still a >> pretty internal operation, so I don't see it as critical. >> > For correctness, yes. > > At this point, my question is: do we want to support the lsn data type idea or > a basic function that implements the difference between LSNs? Personally I think a function is enough - it solves the only case that I've actually seen. But a datatype would be a more complete solution, of course - but it seems a bit of an overkill to me. Not really sure which way we should go - I was hoping somebody else would comment as well.. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: >> At this point, my question is: do we want to support the lsn data type idea or >> a basic function that implements the difference between LSNs? > Personally I think a function is enough - it solves the only case that > I've actually seen. But a datatype would be a more complete solution, > of course - but it seems a bit of an overkill to me. Not really sure > which way we should go - I was hoping somebody else would comment as > well.. I too think a datatype is overkill, if we're only planning on providing one function. Just emit the values as numeric and have done. regards, tom lane
On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I too think a datatype is overkill, if we're only planning on providing > one function. Are there any other functions we ought to provide? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/23/2011 10:05 AM, Tom Lane wrote: > Magnus Hagander<magnus@hagander.net> writes: >>> At this point, my question is: do we want to support the lsn data type idea or >>> a basic function that implements the difference between LSNs? >> Personally I think a function is enough - it solves the only case that >> I've actually seen. But a datatype would be a more complete solution, >> of course - but it seems a bit of an overkill to me. Not really sure >> which way we should go - I was hoping somebody else would comment as >> well.. > I too think a datatype is overkill, if we're only planning on providing > one function. Just emit the values as numeric and have done. > > +1. cheers andrew
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I too think a datatype is overkill, if we're only planning on providing >> one function. > Are there any other functions we ought to provide? Even if there are several, what exact advantage does a datatype offer over representing LSN values as numerics? It seems to me to be adding complication and extra code (I/O converters at least) for very little gain. regards, tom lane
On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I too think a datatype is overkill, if we're only planning on providing >>> one function. > >> Are there any other functions we ought to provide? > > Even if there are several, what exact advantage does a datatype offer > over representing LSN values as numerics? It seems to me to be adding > complication and extra code (I/O converters at least) for very little > gain. I guess I'm just constitutionally averse to labeling things as "text" when they really aren't. I do it all the time in Perl, of course, but in PostgreSQL we have strong data typing, and it seems like we might as well use it. Also, we've occasionally talked (in the light of Pavan's single-pass vacuum patch, for example) about needing to store LSNs in system catalogs; and we're certainly not going to want to do that as text. I'll admit that it's not 100% clear that anything like this will ever happen, though, so maybe it's premature to worry about it. I can see I'm in the minority on this one, though... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Even if there are several, what exact advantage does a datatype offer >> over representing LSN values as numerics? It seems to me to be adding >> complication and extra code (I/O converters at least) for very little >> gain. > I guess I'm just constitutionally averse to labeling things as "text" > when they really aren't. Er ... text? I thought the proposal was to use numeric. regards, tom lane
On Fri, Dec 23, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Even if there are several, what exact advantage does a datatype offer >>> over representing LSN values as numerics? It seems to me to be adding >>> complication and extra code (I/O converters at least) for very little >>> gain. > >> I guess I'm just constitutionally averse to labeling things as "text" >> when they really aren't. > > Er ... text? I thought the proposal was to use numeric. The proposal is to make a function that takes a text argument (which is really an LSN, but we choose to represent it as text) and returns numeric. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Dec 6, 2011 at 1:19 PM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > Hi, > > A while ago when blogging about WAL [1], I noticed a function to deal with > xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and > after some questions during trainings and conferences I decided to translate > my shell script function in C. > > The attached patch implements the function pg_xlog_location_diff (bikeshed > colors are welcome). It calculates the difference between two given > transaction log locations. Now that we have pg_stat_replication view, it will > be easy to get the lag just passing columns as parameters. Also, the > monitoring tools could take advantage of it instead of relying on a fragile > routine to get the lag. > > I noticed that pg_xlogfile_name* functions does not sanity check the xrecoff > boundaries but that is material for another patch. > > > [1] http://eulerto.blogspot.com/2011/11/understanding-wal-nomenclature.html > [2] > http://www.depesz.com/index.php/2011/01/24/waiting-for-9-1-pg_stat_replication/ I think that this function is very useful. Can you add the patch into CommitFest 2012-1 ? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 14-01-2012 11:06, Fujii Masao wrote: > I think that this function is very useful. Can you add the patch into > CommitFest 2012-1 ? > Sure. But I must adjust the patch based on the thread comments (basically, numeric output). I have a new patch but need to test it before submitting it. I'll post this weekend. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte24x7 e Treinamento
On 01/14/2012 09:12 AM, Euler Taveira de Oliveira wrote: > But I must adjust the patch based on the thread comments (basically, > numeric output). I have a new patch but need to test it before submitting it. > I'll post this weekend. It's now at https://commitfest.postgresql.org/action/patch_view?id=776 listed as waiting on you right now. It's good to put patches into the CF application early. Helps planning, and gives a safety net against all sorts of things. We wouldn't want something this obviously useful to get kicked out if, for example, you lost your Internet connection over the weekend and then didn't technically qualify as having submitted it there before the deadline. As someone who sweated today for two hours when my power at home was turned off to install a new circuit breaker, I'm feeling particularly paranoid right now about that sort of thing here. The fact that you got some review feedback before the official CF start doesn't mean you can't be listed there right now. In fact, those are things I like to see tracked. Having links to all of the e-mail messages that were important turning points for a feature that changed during review is very helpful to reviewers and committers. And the easiest way to keep up with that is to start as early as possible: add it to the app right after the first patch submission. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On Sat, Jan 14, 2012 at 8:18 PM, Greg Smith <greg@2ndquadrant.com> wrote:
I agree.
So lets make it easy for the patch submitter to start the process. I propose that we have a page in the CF application where people can upload/attach the patch, and the app posts the patch to -hackers and uses the post URL to create the CF entry.
On 01/14/2012 09:12 AM, Euler Taveira de Oliveira wrote:It's now at https://commitfest.postgresql.org/action/patch_view?id=776 listed as waiting on you right now. It's good to put patches into the CF application early. Helps planning, and gives a safety net against all sorts of things. We wouldn't want something this obviously useful to get kicked out if, for example, you lost your Internet connection over the weekend and then didn't technically qualify as having submitted it there before the deadline. As someone who sweated today for two hours when my power at home was turned off to install a new circuit breaker, I'm feeling particularly paranoid right now about that sort of thing here.But I must adjust the patch based on the thread comments (basically,
numeric output). I have a new patch but need to test it before submitting it.
I'll post this weekend.
he patch
The fact that you got some review feedback before the official CF start doesn't mean you can't be listed there right now. In fact, those are things I like to see tracked. Having links to all of the e-mail messages that were important turning points for a feature that changed during review is very helpful to reviewers and committers. And the easiest way to keep up with that is to start as early as possible: add it to the app right after the first patch submission.
I agree.
So lets make it easy for the patch submitter to start the process. I propose that we have a page in the CF application where people can upload/attach the patch, and the app posts the patch to -hackers and uses the post URL to create the CF entry.
Regards,
--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
On 01/14/2012 10:49 PM, Gurjeet Singh wrote: > So lets make it easy for the patch submitter to start the process. I > propose that we have a page in the CF application where people can > upload/attach the patch, and the app posts the patch to -hackers and > uses the post URL to create the CF entry. > That would be nice, but there's at least two serious problems with it, which I would guess are both unsolvable without adding an unsupportable amount of work to the current PostgreSQL web team. First, it is technically risky for a web application hosted on postgresql.org to be e-mailing this list. There are some things in the infrastructure that do that already--I believe the pgsql-commiters list being driven from commits is the busiest such bot. But all of the ones that currently exist are either moderated, have a limited number of approved submitters, or both. If it were possible for a bot to create a postgresql.org community account, then trigger an e-mail to pgsql-hackers just by filling out a web form, I'd give it maybe six months before it has to be turned off for a bit--because there are thousands messages queued up once the first bored spammer figures that out. Securing web to e-mail gateways is a giant headache, and everyone working on the PostgreSQL infrastructure who might work on that is already overloaded with community volunteer work. There's an element of zero-sum game here: while this would provide some assistance to new contributors, the time to build and maintain the thing would be coming mainly out of senior contributors. I see the gain+risk vs. reward here skewed the wrong way. Second, e-mail provides some level of validation that patches being submitted are coming from the person they claim. We currently reject patches that are only shared with the community on the web, via places like github. The process around this mailing list tries to make it clear sending patches to here is a code submission under the PostgreSQL license. And e-mail nowadays keeps increasing the number of checks that confirm it's coming from the person it claims sent it. I can go check into the DKIM credentials your Gmail message to the list contained if I'd like, to help confirm it really came from your account. E-mail headers are certainly not perfectly traceable and audit-able, but they are far better than what you'd get from a web submission. Little audit trail there beyond "came from this IP address". One unicorn I would like to have here would give the CF app a database of recent e-mails to pgsql-hackers. I login to the CF app, click on "Add recent submission", and anything matching my e-mail address appears with a checkbox next to it. Click on the patch submissions, and then something like you described would happen. That would save me the annoying work around looking up message IDs so much. The role CF manager would benefit even more from infrastructure like that too. Something that listed all the recent e-mail messages for an existing submission, such that you could just click on the ones that you wanted added to the patch's e-mail history, would save me personally enough time that I could probably even justify writing it. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On Sun, Jan 15, 2012 at 05:44, Greg Smith <greg@2ndquadrant.com> wrote: > On 01/14/2012 10:49 PM, Gurjeet Singh wrote: >> >> So lets make it easy for the patch submitter to start the process. I >> propose that we have a page in the CF application where people can >> upload/attach the patch, and the app posts the patch to -hackers and uses >> the post URL to create the CF entry. >> > > That would be nice, but there's at least two serious problems with it, which > I would guess are both unsolvable without adding an unsupportable amount of > work to the current PostgreSQL web team. First, it is technically risky for > a web application hosted on postgresql.org to be e-mailing this list. There > are some things in the infrastructure that do that already--I believe the > pgsql-commiters list being driven from commits is the busiest such bot. But > all of the ones that currently exist are either moderated, have a limited > number of approved submitters, or both. It's not really a problem from that perspective, as long as it requires the user to be logged in. The mail would be sent from the users account, with that one as a sender, and thus be exposed to the same moderation rules as the rest of the list posts. > If it were possible for a bot to create a postgresql.org community account, > then trigger an e-mail to pgsql-hackers just by filling out a web form, I'd > give it maybe six months before it has to be turned off for a bit--because > there are thousands messages queued up once the first bored spammer figures Said bot can already use the bug report form *without* having to sign up for an account. Or said bot could submit news or events, which trigger an email to at least some lists, which hasn't bene done. It's supposedly not easy for a bot to sign up for a community account, since it requires you to have access to the email address it's registered on. If that doesn't work, it's a bug and needs to be fixed regardless. > that out. Securing web to e-mail gateways is a giant headache, and everyone > working on the PostgreSQL infrastructure who might work on that is already > overloaded with community volunteer work. There's an element of zero-sum We've already solved that problem for other situtations, and given how the infrastructure is built, that's fairly easy to replicate to another node. I think the bigger problem is "who'll write it". AFAIK, the CF app *itself* is even more person- and time-constrained to senior developers (Robert Haas only) than the infrastructure, and that's a bigger problem. There are already a bunch of things that are a lot simpler than this that has been pending on that one for well over half a year. > Second, e-mail provides some level of validation that patches being > submitted are coming from the person they claim. We currently reject > patches that are only shared with the community on the web, via places like > github. The process around this mailing list tries to make it clear sending > patches to here is a code submission under the PostgreSQL license. And > e-mail nowadays keeps increasing the number of checks that confirm it's > coming from the person it claims sent it. I can go check into the DKIM > credentials your Gmail message to the list contained if I'd like, to help > confirm it really came from your account. E-mail headers are certainly not I think DKIM was a bad example, because AFAIK our lists mangle DKIM and thus actually show them as *invalid* for at least the majority of messages... > One unicorn I would like to have here would give the CF app a database of > recent e-mails to pgsql-hackers. I login to the CF app, click on "Add > recent submission", and anything matching my e-mail address appears with a > checkbox next to it. Click on the patch submissions, and then something > like you described would happen. That would save me the annoying work > around looking up message IDs so much. That would be neat. And FWIW, I'd find it a lot more useful for the CF app to have the ability to post *reviews* in it, that would end up being properly threaded. The way it is now, half the reviewers create a *new* thread to post their reviews on, making it a PITA to keep track of those patches on the list at all, which somewhat takes away the whole idea of "mail being the primary way to track it". Not saying it's critical, but I'd put it a lot higher on the list than being able to post the initial patch. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 01/15/2012 03:17 AM, Magnus Hagander wrote: > And FWIW, I'd find it a lot more useful for the CF app to have the > ability to post *reviews* in it, that would end up being properly > threaded. Next you'll be saying we should have some sort of web application to help with the whole review process, show the work on an integrated sort of Review Board or something. What crazy talk. My contribution toward patch review ease for this week is that peg is quite a bit smarter about referring to the correct part of the origin git repo now, when you've been working on a branch for a while then create a new one: https://github.com/gregs1104/peg Last week's was that I confirmed that on a Mac using Homebrew for package management, after "brew install postgresql" to get the dependencies in, you can then use peg to setup a PostgreSQL in your home directory for patch testing or development. Works fine out of the box, you just won't have things like all the PLs installed. Yes, I am aware I'm going at this bottom-up. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On Sun, Jan 15, 2012 at 09:37, Greg Smith <greg@2ndquadrant.com> wrote: > On 01/15/2012 03:17 AM, Magnus Hagander wrote: >> >> And FWIW, I'd find it a lot more useful for the CF app to have the >> ability to post *reviews* in it, that would end up being properly >> threaded. > > > Next you'll be saying we should have some sort of web application to help > with the whole review process, show the work on an integrated sort of Review > Board or something. What crazy talk. Well, it's early in the morning for being sunday, I blame that. > My contribution toward patch review ease for this week is that peg is quite > a bit smarter about referring to the correct part of the origin git repo > now, when you've been working on a branch for a while then create a new one: > https://github.com/gregs1104/peg Being able to refer to a git branch is one of those things that have been on the todo list for the cf app since pgcon last year... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Sun, Jan 15, 2012 at 3:45 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Sun, Jan 15, 2012 at 09:37, Greg Smith <greg@2ndquadrant.com> wrote: >> On 01/15/2012 03:17 AM, Magnus Hagander wrote: >>> And FWIW, I'd find it a lot more useful for the CF app to have the >>> ability to post *reviews* in it, that would end up being properly >>> threaded. >> >> Next you'll be saying we should have some sort of web application to help >> with the whole review process, show the work on an integrated sort of Review >> Board or something. What crazy talk. > > Well, it's early in the morning for being sunday, I blame that. > >> My contribution toward patch review ease for this week is that peg is quite >> a bit smarter about referring to the correct part of the origin git repo >> now, when you've been working on a branch for a while then create a new one: >> https://github.com/gregs1104/peg > > Being able to refer to a git branch is one of those things that have > been on the todo list for the cf app since pgcon last year... Do we have an actual written TODO list for the cf app somewhere? If not, I think creating one would be a good idea. I realize I've been remiss in addressing some of the things people want, but the lack of any centralized place where such items are collected doesn't make it simpler. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jan 16, 2012 at 18:57, Robert Haas <robertmhaas@gmail.com> wrote: > On Sun, Jan 15, 2012 at 3:45 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Sun, Jan 15, 2012 at 09:37, Greg Smith <greg@2ndquadrant.com> wrote: >>> On 01/15/2012 03:17 AM, Magnus Hagander wrote: >>>> And FWIW, I'd find it a lot more useful for the CF app to have the >>>> ability to post *reviews* in it, that would end up being properly >>>> threaded. >>> >>> Next you'll be saying we should have some sort of web application to help >>> with the whole review process, show the work on an integrated sort of Review >>> Board or something. What crazy talk. >> >> Well, it's early in the morning for being sunday, I blame that. >> >>> My contribution toward patch review ease for this week is that peg is quite >>> a bit smarter about referring to the correct part of the origin git repo >>> now, when you've been working on a branch for a while then create a new one: >>> https://github.com/gregs1104/peg >> >> Being able to refer to a git branch is one of those things that have >> been on the todo list for the cf app since pgcon last year... > > Do we have an actual written TODO list for the cf app somewhere? If > not, I think creating one would be a good idea. I realize I've been > remiss in addressing some of the things people want, but the lack of > any centralized place where such items are collected doesn't make it > simpler. I don't think so - I've been keeping mine in your mailbox ;) A simple wiki page is probably enough - going for an actual tracker or anything seems vastly overkill... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 1/14/12 8:44 PM, Greg Smith wrote: > Second, e-mail provides some level of validation that patches being > submitted are coming from the person they claim. We currently reject > patches that are only shared with the community on the web, via places > like github. The process around this mailing list tries to make it > clear sending patches to here is a code submission under the PostgreSQL > license. And e-mail nowadays keeps increasing the number of checks that > confirm it's coming from the person it claims sent it. I can go check > into the DKIM credentials your Gmail message to the list contained if > I'd like, to help confirm it really came from your account. E-mail > headers are certainly not perfectly traceable and audit-able, but they > are far better than what you'd get from a web submission. Little audit > trail there beyond "came from this IP address". Putting submitters aside, I have to say based on teaching people how to use the CF stuff on Thursday night that the process of submitting a review of a patch is VERY unintuitive, or in the words of one reviewer "astonishingly arcane". Summing up: 1. Log into CF. Claim the patch by editing it. 2. Write a review and email it to pgsql-hackers. 3. Dig the messageID out of your sent mail. 4. Add a comment to the patch, type "Review" with the messageID, and ideally a short summary comment of the review. 5. Edit the patch to change its status as well as to remove yourself as reviewer if you plan to do no further review. There are so many things wrong with this workflow I wouldn't know where to start. The end result, though, is that it strongly discourages the occasional reviewer by making the review process cumbersome and confusing. I'll also point out that the process for *applying* a patch, if you don't subscribe to hackers and keep archives around on your personal machine for months, is also very cumbersome and error-prone. Copy and paste from a web page? Really? Certainly we could spend the next 6 years incrementally improving the CF app "in our spare time". But maybe it might be a better thing to look at the code development tools which are already available? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Excerpts from Josh Berkus's message of lun ene 16 17:48:41 -0300 2012: > Putting submitters aside, I have to say based on teaching people how to > use the CF stuff on Thursday night that the process of submitting a > review of a patch is VERY unintuitive, or in the words of one reviewer > "astonishingly arcane". Summing up: > > 1. Log into CF. Claim the patch by editing it. > > 2. Write a review and email it to pgsql-hackers. > > 3. Dig the messageID out of your sent mail. > > 4. Add a comment to the patch, type "Review" with the messageID, and > ideally a short summary comment of the review. > > 5. Edit the patch to change its status as well as to remove yourself as > reviewer if you plan to do no further review. > > There are so many things wrong with this workflow I wouldn't know where > to start. Other than having to figure out Message-Ids, which most MUAs seem to hide as much as possible, is there anything here of substance? I mean, if getting a message-id from Gmail is all that complicated, please complain to Google. I mean, is email arcane? Surely not. Are summary lines arcane? Give me a break. So the only real complain point here is message-id, which normally people don't care about and don't even know they exist. So they have to learn about it. Let's keep in mind that pgsql-hackers email is our preferred form of communication. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
> I mean, is email arcane? Surely not. Are summary lines arcane? Give > me a break. So the only real complain point here is message-id, which > normally people don't care about and don't even know they exist. So > they have to learn about it. The complaint is that the reviewer is expected to use two different and wholly incompatible methods of communication, each of which requires a separate registration, to post the review. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Mon, Jan 16, 2012 at 1:10 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > > Excerpts from Josh Berkus's message of lun ene 16 17:48:41 -0300 2012: > >> Putting submitters aside, I have to say based on teaching people how to >> use the CF stuff on Thursday night that the process of submitting a >> review of a patch is VERY unintuitive, or in the words of one reviewer >> "astonishingly arcane". Summing up: >> >> 1. Log into CF. Claim the patch by editing it. >> >> 2. Write a review and email it to pgsql-hackers. >> >> 3. Dig the messageID out of your sent mail. >> >> 4. Add a comment to the patch, type "Review" with the messageID, and >> ideally a short summary comment of the review. >> >> 5. Edit the patch to change its status as well as to remove yourself as >> reviewer if you plan to do no further review. >> >> There are so many things wrong with this workflow I wouldn't know where >> to start. > > Other than having to figure out Message-Ids, which most MUAs seem to > hide as much as possible, is there anything here of substance? I find it an annoyance, but can't get too worked up over it. > I mean, > if getting a message-id from Gmail is all that complicated, please > complain to Google. But after digging the message-id out of gmail and entering it into the commitfest app, the resulting link is broken because the email has not yet shown up in the archives. So now I have to wonder if I did something wrong, and keep coming back every few hours to see if will start working. > > I mean, is email arcane? Surely not. Are summary lines arcane? The way you have to set them is pretty arcane. Again, I can't get too worked over it, but if it were made simpler I'd be happier. Cheers, Jeff
Excerpts from Jeff Janes's message of lun ene 16 18:37:59 -0300 2012: > > I mean, > > if getting a message-id from Gmail is all that complicated, please > > complain to Google. > > But after digging the message-id out of gmail and entering it into the > commitfest app, the resulting link is broken because the email has not > yet shown up in the archives. So now I have to wonder if I did > something wrong, and keep coming back every few hours to see if will > start working. Hours? Unless a message is delayed for moderation, it should show up in archives within tem minutes. If you have problems finding emails after that period, by all means complain. Now that we've moved archives to a new host, perhaps we could rerun the archive script every two minutes instead of ten. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 01/16/2012 03:48 PM, Josh Berkus wrote: > 3. Dig the messageID out of your sent mail. > > 4. Add a comment to the patch, type "Review" with the messageID, and > ideally a short summary comment of the review. This is the time consuming part that would benefit the most from some automation. The message-id digging is an obvious sore spot, which is why I focused on improvements to eliminate so much of that first in my suggestions. The problem is that we don't actually want every message sent to the list on a thread to appear on the CF summary, and writing that short summary content is an important step. Archived messages deemed notable enough that someone linked the two are the only ones that appear in the patch history. That makes it possible to come up to speed on the most interesting history points of a patch in a reasonable period of time--even if you missed the earlier discussion. I think any of the other alternatives we might adopt would end up associating all of the e-mail history around a patch. That's the firehose, and spraying the CF app with it makes the whole thing a lot less useful. I don't think this is an unsolvable area to improve. It's been stuck behind the major postgresql.org site overhaul, which is done now. Adding some web service style APIs to probe the archives for message IDs by a) ancestor and b) author would make it possible to sand off a whole lot of rough edges here. While it's annoying in its current form, doing all my work based on message IDs has been a huge improvement over the old approach, where URLs into the archives were date based and not always permanent. > I'll also point out that the process for *applying* a patch, if you > don't subscribe to hackers and keep archives around on your personal > machine for months, is also very cumbersome and error-prone. Copy and > paste from a web page? Really? The most reasonable answer to this is for people to publish a git repo URL in addition to the "official" submission of changes to the list in patch form. And momentum toward doing that just keeps going up, even among longer term contributors who weren't git advocates at all a year during the transition. I nudged Simon that way and he's pushing branches for major patches but not small ones yet, it looks like Andrew fully embraced bitbucket recently, etc. We're 16 months into git adoption. I'm pretty happy with how well that's going. We don't need to add infrastructure to enable people to push code to github and link to their branch comparison repo viewer as a way to view the patch; that's already available to anyone who wants is. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
On 01/16/2012 05:25 PM, Greg Smith wrote: > > The most reasonable answer to this is for people to publish a git repo > URL in addition to the "official" submission of changes to the list in > patch form. And momentum toward doing that just keeps going up, even > among longer term contributors who weren't git advocates at all a year > during the transition. I nudged Simon that way and he's pushing > branches for major patches but not small ones yet, it looks like > Andrew fully embraced bitbucket recently, etc. > > If we're going to do that, the refspec to be pulled needs to be a tag, I think, not just a branch, and people would have to get into the habit of tagging commits and explicitly pushing tags. I probably should be doing that, and it is now built into the buildfarm client release mechanism, but I usually don't when just publishing dev work. Guess I need to start. I'll probably use tag names like branch-YYYYMMDDHHMM. I certainly like the idea of just being able to pull in a tag from a remote instead of applying a patch. (BTW, I use both bitbucket and github. They both have advantages.) cheers andrew
Excerpts from Greg Smith's message of lun ene 16 19:25:50 -0300 2012: > On 01/16/2012 03:48 PM, Josh Berkus wrote: > > I'll also point out that the process for *applying* a patch, if you > > don't subscribe to hackers and keep archives around on your personal > > machine for months, is also very cumbersome and error-prone. Copy and > > paste from a web page? Really? > > The most reasonable answer to this is for people to publish a git repo > URL in addition to the "official" submission of changes to the list in > patch form. It's expected that we'll get a more reasonable interface to attachments, one that will allow you to download patches separately. (Currently, attachments that have mime types other than text/plain are already downloadable separately). -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 20 December 2011 10:27, Magnus Hagander <magnus@hagander.net> wrote: > Doing it in numeric should be perfectly fine. The only real reason to > pick int8 over in this context would be performance, but it's not like > this is something that's going to be called in really performance > critical paths... FYI, my group commit patch has a little macro, in the spirit of XLByteAdvance, to get the delta between two LSNs in bytes as an uint64. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On mån, 2012-01-16 at 17:25 -0500, Greg Smith wrote: > The most reasonable answer to this is for people to publish a git repo > URL in addition to the "official" submission of changes to the list in > patch form. Note that the original complaint was that for the occasional reviewer, the current system takes at least 5 partially redundant steps in two different systems. I doubt that adding a third system and more partially redundant steps it going to help that. I don't have anything against the general idea, but it won't address the original point.
On 01/16/2012 05:40 PM, Alvaro Herrera wrote: > It's expected that we'll get a more reasonable interface to attachments, > one that will allow you to download patches separately. (Currently, > attachments that have mime types other than text/plain are already > downloadable separately). Are you really sure about that? My recent JSON patch is at <http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net>. I don't see any download link for the patch there, yet my mailer set the attachment type to text/x-patch, not text/plain. cheers andrew
On 17/01/2012 17:50, Alvaro Herrera wrote: > > Excerpts from Matteo Beccati's message of mar ene 17 12:33:27 -0300 2012: >> My proof of concept archive for the hackers ML site is still online, in >> case anyone has trouble downloading the patches or just wants to have >> the full thread handy. > > I was going to ping you about this, because I tried it when I wrote this > message and it got stuck waiting for response. Hmm, works for me, e.g. the recently cited message: http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net > Now that we've migrated the website, it's time to get back to our > conversations about migrating archives to your stuff too. How confident > with Django are you? I've never wrote a line of Python in my life, so someone else should work on porting the web part, I'm afraid... Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
On 17/01/2012 18:10, Matteo Beccati wrote: > On 17/01/2012 17:50, Alvaro Herrera wrote: >> >> Excerpts from Matteo Beccati's message of mar ene 17 12:33:27 -0300 2012: >>> My proof of concept archive for the hackers ML site is still online, in >>> case anyone has trouble downloading the patches or just wants to have >>> the full thread handy. >> >> I was going to ping you about this, because I tried it when I wrote this >> message and it got stuck waiting for response. > > Hmm, works for me, e.g. the recently cited message: > > http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net Erm... I meant http://archives.beccati.org/message-id/4F12F9E5.3090801@dunslane.net which redirects to: http://archives.beccati.org/pgsql-hackers/message/305925 for me. Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
Andrew Dunstan <andrew@dunslane.net> writes: > On 01/16/2012 05:40 PM, Alvaro Herrera wrote: >> It's expected that we'll get a more reasonable interface to attachments, >> one that will allow you to download patches separately. (Currently, >> attachments that have mime types other than text/plain are already >> downloadable separately). > Are you really sure about that? My recent JSON patch is at > <http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net>. > I don't see any download link for the patch there, yet my mailer set the > attachment type to text/x-patch, not text/plain. Yeah, AFAICT the archives treat text/x-patch the same as text/plain. I tend to send stuff that way if I mean it primarily to be read in the email. If I'm thinking people will download and apply it, it's better to gzip the patch and pick a mime type appropriate to that, because that makes it much easier to pull the patch off the archives at need, at the cost that you can't just eyeball it in your mail reader. Anyway, I agree with the general tenor of this thread that it'd be nice to reduce the impedance mismatches a bit. Don't have any great ideas about specific ways to do that. regards, tom lane
On 01/17/2012 11:50 AM, Peter Eisentraut wrote: > On mån, 2012-01-16 at 17:25 -0500, Greg Smith wrote: >> The most reasonable answer to this is for people to publish a git repo >> URL in addition to the "official" submission of changes to the list in >> patch form. > Note that the original complaint was that for the occasional reviewer, > the current system takes at least 5 partially redundant steps in two > different systems. I doubt that adding a third system and more > partially redundant steps it going to help that. Publishing the submission via git is an extra step for the patch submitter. If that happens, the reviewer can test just be cloning that, instead of first closing the PostgreSQL one then applying the patch. It removes the "how do I fish the patch out of the archives?" problem from the reviewer's side of things. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Excerpts from Tom Lane's message of mar ene 17 14:24:05 -0300 2012: > > Andrew Dunstan <andrew@dunslane.net> writes: > > On 01/16/2012 05:40 PM, Alvaro Herrera wrote: > >> It's expected that we'll get a more reasonable interface to attachments, > >> one that will allow you to download patches separately. (Currently, > >> attachments that have mime types other than text/plain are already > >> downloadable separately). > > > Are you really sure about that? My recent JSON patch is at > > <http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net>. > > I don't see any download link for the patch there, yet my mailer set the > > attachment type to text/x-patch, not text/plain. > > Yeah, AFAICT the archives treat text/x-patch the same as text/plain. Right, maybe it's text/* or something like that. > I tend to send stuff that way if I mean it primarily to be read in the > email. If I'm thinking people will download and apply it, it's better > to gzip the patch and pick a mime type appropriate to that, because that > makes it much easier to pull the patch off the archives at need, at the > cost that you can't just eyeball it in your mail reader. Maybe we could find a way to convince Mhonarc to present links to download all mime parts separately, not only those that are undisplayable. > Anyway, I agree with the general tenor of this thread that it'd be nice > to reduce the impedance mismatches a bit. Don't have any great ideas > about specific ways to do that. I'm hopeful that the migration to the Archivopteryx stuff by Matteo will improve things a bit. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Matteo Beccati's message of mar ene 17 12:33:27 -0300 2012: > > On 16/01/2012 23:40, Alvaro Herrera wrote: > > > > Excerpts from Greg Smith's message of lun ene 16 19:25:50 -0300 2012: > >> On 01/16/2012 03:48 PM, Josh Berkus wrote: > > > >>> I'll also point out that the process for *applying* a patch, if you > >>> don't subscribe to hackers and keep archives around on your personal > >>> machine for months, is also very cumbersome and error-prone. Copy and > >>> paste from a web page? Really? > >> > >> The most reasonable answer to this is for people to publish a git repo > >> URL in addition to the "official" submission of changes to the list in > >> patch form. > > > > It's expected that we'll get a more reasonable interface to attachments, > > one that will allow you to download patches separately. (Currently, > > attachments that have mime types other than text/plain are already > > downloadable separately). > > My proof of concept archive for the hackers ML site is still online, in > case anyone has trouble downloading the patches or just wants to have > the full thread handy. I was going to ping you about this, because I tried it when I wrote this message and it got stuck waiting for response. Now that we've migrated the website, it's time to get back to our conversations about migrating archives to your stuff too. How confident with Django are you? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 16/01/2012 23:40, Alvaro Herrera wrote: > > Excerpts from Greg Smith's message of lun ene 16 19:25:50 -0300 2012: >> On 01/16/2012 03:48 PM, Josh Berkus wrote: > >>> I'll also point out that the process for *applying* a patch, if you >>> don't subscribe to hackers and keep archives around on your personal >>> machine for months, is also very cumbersome and error-prone. Copy and >>> paste from a web page? Really? >> >> The most reasonable answer to this is for people to publish a git repo >> URL in addition to the "official" submission of changes to the list in >> patch form. > > It's expected that we'll get a more reasonable interface to attachments, > one that will allow you to download patches separately. (Currently, > attachments that have mime types other than text/plain are already > downloadable separately). My proof of concept archive for the hackers ML site is still online, in case anyone has trouble downloading the patches or just wants to have the full thread handy. All you need to do is to swap postgresql.org with beccati.org in the "message-id" link: http://archives.postgresql.org/message-id/1320343602-sup-2290@alvh.no-ip.org -> http://archives.beccati.org/message-id/1320343602-sup-2290@alvh.no-ip.org Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
Excerpts from Andrew Dunstan's message of mar ene 17 13:50:20 -0300 2012: > > On 01/16/2012 05:40 PM, Alvaro Herrera wrote: > > It's expected that we'll get a more reasonable interface to attachments, > > one that will allow you to download patches separately. (Currently, > > attachments that have mime types other than text/plain are already > > downloadable separately). > > > Are you really sure about that? My recent JSON patch is at > <http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net>. > I don't see any download link for the patch there, yet my mailer set the > attachment type to text/x-patch, not text/plain. I tweaked the Mhonarc config and now this attachment (as well as many others) is shown as a downloadable link. Please give it a look. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Alvaro Herrera's message of mar ene 17 22:23:13 -0300 2012: > Excerpts from Andrew Dunstan's message of mar ene 17 13:50:20 -0300 2012: > > > > On 01/16/2012 05:40 PM, Alvaro Herrera wrote: > > > It's expected that we'll get a more reasonable interface to attachments, > > > one that will allow you to download patches separately. (Currently, > > > attachments that have mime types other than text/plain are already > > > downloadable separately). > > > > > > Are you really sure about that? My recent JSON patch is at > > <http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net>. > > I don't see any download link for the patch there, yet my mailer set the > > attachment type to text/x-patch, not text/plain. > > I tweaked the Mhonarc config and now this attachment (as well as many > others) is shown as a downloadable link. Please give it a look. Hm, I notice it works almost every patch I've checked, except the ones from Tom such as this one: http://archives.postgresql.org/message-id/4643.1326776814@sss.pgh.pa.us The problem is that this one doesn't have the Content-Disposition: attachment line in the MIME header. I don't know what we can do about it. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Greg Smith <greg@2ndQuadrant.com> writes: > One unicorn I would like to have here would give the CF app a database > of recent e-mails to pgsql-hackers. I login to the CF app, click on > "Add recent submission", and anything matching my e-mail address > appears with a checkbox next to it. Click on the patch submissions, > and then something like you described would happen. That would save > me the annoying work around looking up message IDs so much. Another idea: introduce some simple tag system in mails sent to -hackers to be treated specially, e.g: @fest add-to-current to add new patch to the commit fest currently in progress, or @fest add-to-next to add it to the next scheduled fest. Attribute your mail with @fest comment COMMENT TEXT or @fest comment <<EOF ... EOF to add a (long) comment, ditto for "patch" and "review". How does that sound? -- Alex
On 01/19/2012 12:59 PM, Alex Shulgin wrote: > Greg Smith<greg@2ndQuadrant.com> writes: > >> One unicorn I would like to have here would give the CF app a database >> of recent e-mails to pgsql-hackers. I login to the CF app, click on >> "Add recent submission", and anything matching my e-mail address >> appears with a checkbox next to it. Click on the patch submissions, >> and then something like you described would happen. That would save >> me the annoying work around looking up message IDs so much. > Another idea: introduce some simple tag system in mails sent to -hackers > to be treated specially, e.g: > > @fest add-to-current > > to add new patch to the commit fest currently in progress, or > > @fest add-to-next > > to add it to the next scheduled fest. > > Attribute your mail with > > @fest comment COMMENT TEXT > > or > > @fest comment<<EOF > ... > EOF > > to add a (long) comment, ditto for "patch" and "review". > > How does that sound? > Like a recipe for something that requires constant fixups, to be honest. Seriously, adding something to the CF isn't *that* hard. I like Greg's idea of a list of recent emails that you could choose from. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 01/19/2012 12:59 PM, Alex Shulgin wrote: > >> Another idea: introduce some simple tag system in mails sent to -hackers >> to be treated specially, e.g: >> >> @fest add-to-current >> >> to add new patch to the commit fest currently in progress, or >> >> @fest add-to-next >> >> to add it to the next scheduled fest. >> >> Attribute your mail with >> >> @fest comment COMMENT TEXT >> >> or >> >> @fest comment<<EOF >> ... >> EOF >> >> to add a (long) comment, ditto for "patch" and "review". >> >> How does that sound? >> > > Like a recipe for something that requires constant fixups, to be honest. > > Seriously, adding something to the CF isn't *that* hard. I like Greg's > idea of a list of recent emails that you could choose from. I've just added a comment about a patch and it took me to: a. Login to commitfest app b. Locate the patch and review I was replying to c. Fetch archives thread index, refresh the index page for ~10 minutes to see my reply appear d. Copy message id and finally register comment in the commitfest app (IIRC, something close to that was already described in this thread) With the proposed approach it would only take me to include @fest comment "Patch applies cleanly" and possibly @fest status Needs Review to update the patch status and that'd be it. -- Alex PS: yes, I could just copy message id from the sent mail in my MUA, but I like to make sure links I post aren't broke, so still I'll need to wait until archives catches up to double check.
Excerpts from Alex Shulgin's message of jue ene 19 15:41:54 -0300 2012: > PS: yes, I could just copy message id from the sent mail in my MUA, but > I like to make sure links I post aren't broke, so still I'll need to > wait until archives catches up to double check. I find this a bad excuse. If you're a pgsql-hackers regular, then you already know your posts are going to show up with the correct message-id. The links might be broken for the next 10 minutes, but links that stay broken for a longer period than that should be rare. Surely you don't change your MUA once a month or anything. I know I don't waste time waiting for my posts to show up in the archives before adding links to the CF app. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 01/19/2012 01:41 PM, Alex Shulgin wrote: > > With the proposed approach it would only take me to include > > @fest comment "Patch applies cleanly" > > and possibly > > @fest status Needs Review > > to update the patch status and that'd be it. > It will be easy if you get it right. My point was that it's way too easy to get it wrong. cheers andrew
On 23-12-2011 12:05, Tom Lane wrote: > I too think a datatype is overkill, if we're only planning on providing > one function. Just emit the values as numeric and have done. > Here it is. Output changed to numeric. While the output was int8 I could use pg_size_pretty but now I couldn't. I attached another patch that implements pg_size_pretty(numeric). -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > The problem is that this one doesn't have the > Content-Disposition: attachment > line in the MIME header. I don't know what we can do about it. It's sent with an inline attachment AFAICT, some MA will make it easy to process the attachment and some others will just make the content appear within the mail. It seems the vast majority falls into the unhelpful second category. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Alex Shulgin <ash@commandprompt.com> writes: > Another idea: introduce some simple tag system in mails sent to -hackers > to be treated specially, e.g: [...] > How does that sound? Very much like what debbugs does already. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sun, Jan 22, 2012 at 1:13 AM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 23-12-2011 12:05, Tom Lane wrote: >> I too think a datatype is overkill, if we're only planning on providing >> one function. Just emit the values as numeric and have done. >> > Here it is. Output changed to numeric. Thanks! When I compiled the source with xlogdiff.patch, I got the following warnings. xlogfuncs.c:511:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat] xlogfuncs.c:511:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 4 has type 'uint64 *' [-Wformat] xlogfuncs.c:515:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat] xlogfuncs.c:515:2: warning: format '%lX' expects argument of type 'long unsigned int *', but argument 4 has type 'uint64 *' [-Wformat] xlogfuncs.c:524:3: warning: format '%lX' expects argument of type 'long unsigned int', but argument 2 has type 'uint64' [-Wformat] xlogfuncs.c:528:3: warning: format '%lX' expects argument of type 'long unsigned int', but argument 2 has type 'uint64' [-Wformat] When I tested the patch, I got the following error: postgres=# SELECT pg_current_xlog_location();pg_current_xlog_location --------------------------0/2000074 (1 row) postgres=# SELECT pg_xlog_location_diff('0/2000074', '0/2000074'); ERROR: xrecoff "2000074" is out of valid range, 0..A4A534C In func.sgml <para> The functions shown in <xref linkend="functions-admin-backup-table"> assist in making on-line backups. These functions cannot be executed during recovery. </para> Since pg_xlog_location_diff() can be executed during recovery, the above needs to be updated. > While the output was int8 I could use > pg_size_pretty but now I couldn't. I attached another patch that implements > pg_size_pretty(numeric). I agree it's necessary. * Note: every entry in pg_proc.h is expected to have a DESCR() comment,* except for functions that implement pg_operator.hoperators and don't* have a good reason to be called directly rather than via the operator. According to the above source code comment in pg_proc.h, ISTM pg_size_pretty() for numeric also needs to have its own DESCR(). + buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size))); + result = strcat(buf, " kB"); According to "man strcat", the dest string must have enough space for the result. "buf" has enough space? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 26-01-2012 06:19, Fujii Masao wrote: Thanks for your review. Comments below. > When I compiled the source with xlogdiff.patch, I got the following warnings. > > xlogfuncs.c:511:2: warning: format '%lX' expects argument of type > 'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat] > What is your compiler? I'm using gcc 4.6.2. I refactored the patch so I'm now using XLogRecPtr and %X. > postgres=# SELECT pg_xlog_location_diff('0/2000074', '0/2000074'); > ERROR: xrecoff "2000074" is out of valid range, 0..A4A534C > Ugh? I can't reproduce that. It seems to be related to long int used by the prior version. > Since pg_xlog_location_diff() can be executed during recovery, > the above needs to be updated. > Fixed. >> While the output was int8 I could use >> pg_size_pretty but now I couldn't. I attached another patch that implements >> pg_size_pretty(numeric). > I realized that it collides with the pg_size_pretty(int8) if we don't specify a type. Hence, I decided to drop the pg_size_pretty(int8) in favor of pg_size_pretty(numeric). It is slower than the former but it is not a performance critical function. > According to the above source code comment in pg_proc.h, ISTM > pg_size_pretty() for numeric also needs to have its own DESCR(). > Fixed. > According to "man strcat", the dest string must have enough space for > the result. > "buf" has enough space? > Ops. Fixed. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachment
On Wed, Feb 8, 2012 at 2:29 AM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 26-01-2012 06:19, Fujii Masao wrote: > > Thanks for your review. Comments below. > >> When I compiled the source with xlogdiff.patch, I got the following warnings. >> >> xlogfuncs.c:511:2: warning: format '%lX' expects argument of type >> 'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat] >> > What is your compiler? I'm using gcc 4.6.2. I refactored the patch so I'm now > using XLogRecPtr and %X. gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) $ uname -a Linux hermes 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 15:59:53 UTC 2012 i686 i686 i386 GNU/Linux In the updated version of the patch, I got no warnings at the compile time. But initdb failed because the OID which you assigned to pg_xlog_location_diff() has already been used for other function. So you need to update pg_proc.h. >> postgres=# SELECT pg_xlog_location_diff('0/2000074', '0/2000074'); >> ERROR: xrecoff "2000074" is out of valid range, 0..A4A534C >> > Ugh? I can't reproduce that. It seems to be related to long int used by the > prior version. Maybe. But another problem happened. When I changed pg_proc.h so that the unused OID was assigned to pg_xlog_location_diff(), and executed the above again, I encountered the segmentation fault: LOG: server process (PID 14384) was terminated by signal 11: Segmentation fault DETAIL: Failed process was running: SELECT pg_xlog_location_diff('0/2000074', '0/2000074'); LOG: terminating any other active server processes ISTM that the cause is that int8_numeric() is executed for uint32 value. We should use int4_numeric(), instead? >>> While the output was int8 I could use >>> pg_size_pretty but now I couldn't. I attached another patch that implements >>> pg_size_pretty(numeric). >> > I realized that it collides with the pg_size_pretty(int8) if we don't specify > a type. Hence, I decided to drop the pg_size_pretty(int8) in favor of > pg_size_pretty(numeric). It is slower than the former but it is not a > performance critical function. I'm OK with this. -DATA(insert OID = 2288 ( pg_size_pretty PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 25 "20" _null_ _null_ _null_ _null_ pg_size_pretty _null_ _null_ _null_ )); -DESCR("convert a long int to a human readable text using size units"); +DATA(insert OID = 3158 ( pg_size_pretty PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 25 "1700" _null_ _null_ _null_ _null_ pg_size_pretty _null_ _null_ _null_ )); +DESCR("convert a numeric to a human readable text using size units"); Why OID needs to be reassigned? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 08-02-2012 09:35, Fujii Masao wrote: Fujii, new patch attached. Thanks for your tests. > But another problem happened. When I changed pg_proc.h so that the unused > OID was assigned to pg_xlog_location_diff(), and executed the above again, > I encountered the segmentation fault: > I reproduced the problems in my old 32-bit laptop. I fixed it casting to int64. I also updated the duplicated OID. > Why OID needs to be reassigned? > There isn't a compelling reason. It is just a way to say: "hey, it is another function with the same old name". I'll not attach another version for pg_size_pretty because it is a matter of updating a duplicated OID. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachment
On Fri, Feb 10, 2012 at 7:00 AM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 08-02-2012 09:35, Fujii Masao wrote: > > Fujii, new patch attached. Thanks for your tests. Thanks for the new patch! >> But another problem happened. When I changed pg_proc.h so that the unused >> OID was assigned to pg_xlog_location_diff(), and executed the above again, >> I encountered the segmentation fault: >> > I reproduced the problems in my old 32-bit laptop. I fixed it casting to > int64. I also updated the duplicated OID. Yep, in the updated patch, I could confirm that the function works fine without any error in my machine. The patch looks fine to me except the following minor comments: In the document, it's better to explain clearly that the function subtracts the second argument from the first. - These functions cannot be executed during recovery. + These functions cannot be executed during recovery (except + <function>pg_xlog_location_diff</function>). + <function>pg_xlog_location_diff</> calculates the difference in bytes + between two transaction log locations. It can be used with + <structname>pg_stat_replication</structname> or some functions shown in + <xref linkend="functions-admin-backup-table"> to get the replication lag. Very minor comment: you should use spaces rather than a tab to indent each line. >> Why OID needs to be reassigned? >> > There isn't a compelling reason. It is just a way to say: "hey, it is another > function with the same old name". > > I'll not attach another version for pg_size_pretty because it is a matter of > updating a duplicated OID. Okay, I reviewed the previous patch again. That looks fine to me. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Fri, Feb 10, 2012 at 09:32, Fujii Masao <masao.fujii@gmail.com> wrote: > On Fri, Feb 10, 2012 at 7:00 AM, Euler Taveira de Oliveira > <euler@timbira.com> wrote: >> On 08-02-2012 09:35, Fujii Masao wrote: >> >> Fujii, new patch attached. Thanks for your tests. > > Thanks for the new patch! > >>> But another problem happened. When I changed pg_proc.h so that the unused >>> OID was assigned to pg_xlog_location_diff(), and executed the above again, >>> I encountered the segmentation fault: >>> >> I reproduced the problems in my old 32-bit laptop. I fixed it casting to >> int64. I also updated the duplicated OID. > > Yep, in the updated patch, I could confirm that the function works fine without > any error in my machine. The patch looks fine to me except the following minor > comments: I started working on this one to commit it, and came up with a few things more. Do we even *need* the validate_xlog_location() function? If we just remove those calls, won't we still catch all the incorrectly formatted ones in the errors of the sscanf() calls? Or am I too deep into weekend-mode and missing something obvious? I've also removed tabs in the documentation, fixed the merge confllict in pg_proc.h that happened during the wait, and fixed some indentation (updated patch with these changes attached). But I'm going to hold off committing it until someone confirms I'm not caught too deeply in weekend-mode and am missing something obvious in the comment above about validate_xlog_location. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Attachment
On 25-02-2012 09:23, Magnus Hagander wrote: > Do we even *need* the validate_xlog_location() function? If we just > remove those calls, won't we still catch all the incorrectly formatted > ones in the errors of the sscanf() calls? Or am I too deep into > weekend-mode and missing something obvious? > sscanf() is too fragile for input sanity check. Try pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing that function if you protect xlog location input from silly users. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte24x7 e Treinamento
On Sun, Feb 26, 2012 at 8:53 AM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 25-02-2012 09:23, Magnus Hagander wrote: >> Do we even *need* the validate_xlog_location() function? If we just >> remove those calls, won't we still catch all the incorrectly formatted >> ones in the errors of the sscanf() calls? Or am I too deep into >> weekend-mode and missing something obvious? >> > sscanf() is too fragile for input sanity check. Try > pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing > that function if you protect xlog location input from silly users. After this patch will have been committed, it would be better to change pg_xlogfile_name() and pg_xlogfile_name_offset() so that they use the validate_xlog_location() function to validate the input. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Sun, Feb 26, 2012 at 00:53, Euler Taveira de Oliveira <euler@timbira.com> wrote: > On 25-02-2012 09:23, Magnus Hagander wrote: >> Do we even *need* the validate_xlog_location() function? If we just >> remove those calls, won't we still catch all the incorrectly formatted >> ones in the errors of the sscanf() calls? Or am I too deep into >> weekend-mode and missing something obvious? >> > sscanf() is too fragile for input sanity check. Try > pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing > that function if you protect xlog location input from silly users. Ah, good point. No, that's the reason I was missing :-) Patch applied, thanks! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, Feb 28, 2012 at 07:21, Fujii Masao <masao.fujii@gmail.com> wrote: > On Sun, Feb 26, 2012 at 8:53 AM, Euler Taveira de Oliveira > <euler@timbira.com> wrote: >> On 25-02-2012 09:23, Magnus Hagander wrote: >>> Do we even *need* the validate_xlog_location() function? If we just >>> remove those calls, won't we still catch all the incorrectly formatted >>> ones in the errors of the sscanf() calls? Or am I too deep into >>> weekend-mode and missing something obvious? >>> >> sscanf() is too fragile for input sanity check. Try >> pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing >> that function if you protect xlog location input from silly users. > > After this patch will have been committed, it would be better to change > pg_xlogfile_name() and pg_xlogfile_name_offset() so that they use > the validate_xlog_location() function to validate the input. And I've done this part as well. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Sun, Mar 4, 2012 at 8:26 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Sun, Feb 26, 2012 at 00:53, Euler Taveira de Oliveira > <euler@timbira.com> wrote: >> On 25-02-2012 09:23, Magnus Hagander wrote: >>> Do we even *need* the validate_xlog_location() function? If we just >>> remove those calls, won't we still catch all the incorrectly formatted >>> ones in the errors of the sscanf() calls? Or am I too deep into >>> weekend-mode and missing something obvious? >>> >> sscanf() is too fragile for input sanity check. Try >> pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing >> that function if you protect xlog location input from silly users. > > Ah, good point. No, that's the reason I was missing :-) > > Patch applied, thanks! Thanks for committing the patch! Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint) with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff()) succeeds. It's also worth committing this patch? http://archives.postgresql.org/message-id/4F315F6C.8030700@timbira.com Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao <masao.fujii@gmail.com> writes: > Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint) > with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff()) > succeeds. It's also worth committing this patch? Why would it be useful to use pg_size_pretty on xlog locations? -1 because of the large expense of bigint->numeric->whatever conversion that would be added to existing uses. regards, tom lane
On Fri, Mar 9, 2012 at 9:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint) >> with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff()) >> succeeds. It's also worth committing this patch? > > Why would it be useful to use pg_size_pretty on xlog locations? > -1 because of the large expense of bigint->numeric->whatever conversion > that would be added to existing uses. The point is that it would be useful to use it on the difference between two xlog locations, but that is a numeric value, not int8, because of signedness issues. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
I wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint) >> with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff()) >> succeeds. It's also worth committing this patch? > Why would it be useful to use pg_size_pretty on xlog locations? > -1 because of the large expense of bigint->numeric->whatever conversion > that would be added to existing uses. Actually ... now that I look at it, isn't it completely bogus to be using numeric for the result of pg_xlog_location_diff? There's no way for the difference of two xlog locations to be anywhere near as wide as 64 bits. That'd only be possible if XLogFileSize exceeded 1GB, which we don't let it get anywhere near. regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Mar 9, 2012 at 9:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Why would it be useful to use pg_size_pretty on xlog locations? > The point is that it would be useful to use it on the difference > between two xlog locations, Um, that is exactly the claim I was questioning. Why is that useful? > but that is a numeric value, not int8, because of signedness issues. See my followup --- this statement appears factually incorrect, whatever you may feel about the usefulness issue. regards, tom lane
On Fri, Mar 9, 2012 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Fujii Masao <masao.fujii@gmail.com> writes: >>> Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint) >>> with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff()) >>> succeeds. It's also worth committing this patch? > >> Why would it be useful to use pg_size_pretty on xlog locations? >> -1 because of the large expense of bigint->numeric->whatever conversion >> that would be added to existing uses. > > Actually ... now that I look at it, isn't it completely bogus to be > using numeric for the result of pg_xlog_location_diff? There's no way > for the difference of two xlog locations to be anywhere near as wide as > 64 bits. That'd only be possible if XLogFileSize exceeded 1GB, which we > don't let it get anywhere near. rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0');pg_xlog_location_diff ----------------------- 18374686475393433600 (1 row) rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0')::int8; ERROR: bigint out of range STATEMENT: select pg_xlog_location_diff('ffffffff/0', '0/0')::int8; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Mar 9, 2012 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Actually ... now that I look at it, isn't it completely bogus to be >> using numeric for the result of pg_xlog_location_diff? > rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0')::int8; > ERROR: bigint out of range Oh ... I see my mistake. I was looking at this: /* * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2 */ and confusing XLogFileSize with XLogSegSize. Not the best choice of names. regards, tom lane
On Fri, Mar 9, 2012 at 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Mar 9, 2012 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Actually ... now that I look at it, isn't it completely bogus to be >>> using numeric for the result of pg_xlog_location_diff? > >> rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0')::int8; >> ERROR: bigint out of range > > Oh ... I see my mistake. I was looking at this: > > /* > * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2 > */ > > and confusing XLogFileSize with XLogSegSize. Not the best choice of > names. Yeah, the use of XLogFile to mean something other than, well a file in the xlog, is greatly annoying.. I guess we could change it, but it goes pretty deep in the system so it's not a small change... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint) >> with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff()) >> succeeds. It's also worth committing this patch? > > Why would it be useful to use pg_size_pretty on xlog locations? > -1 because of the large expense of bigint->numeric->whatever conversion > that would be added to existing uses. Given the expense, perhaps we need to different (overloaded) functions instead? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Why would it be useful to use pg_size_pretty on xlog locations? >> -1 because of the large expense of bigint->numeric->whatever conversion >> that would be added to existing uses. > Given the expense, perhaps we need to different (overloaded) functions instead? That would be a workable solution, but I continue to not believe that this is useful enough to be worth the trouble. regards, tom lane
On Fri, Mar 9, 2012 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Why would it be useful to use pg_size_pretty on xlog locations? >>> -1 because of the large expense of bigint->numeric->whatever conversion >>> that would be added to existing uses. > >> Given the expense, perhaps we need to different (overloaded) functions instead? > > That would be a workable solution, but I continue to not believe that > this is useful enough to be worth the trouble. There's certainly some use to being able to prettify it. Wouldn't a pg_size_pretty(numeric) also be useful if you want to pg_size_() a sum() of something? Used on files it doesn't make too much sense, given how big those files have to be, but it can be used on other things as well... I can see a usecase for having a pg_size_pretty(numeric) as an option. Not necessarily a very big one, but a >0 one. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, Mar 9, 2012 at 12:38 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Mar 9, 2012 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>> On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Why would it be useful to use pg_size_pretty on xlog locations? >>>> -1 because of the large expense of bigint->numeric->whatever conversion >>>> that would be added to existing uses. >> >>> Given the expense, perhaps we need to different (overloaded) functions instead? >> >> That would be a workable solution, but I continue to not believe that >> this is useful enough to be worth the trouble. > > There's certainly some use to being able to prettify it. Wouldn't a > pg_size_pretty(numeric) also be useful if you want to pg_size_() a > sum() of something? Used on files it doesn't make too much sense, > given how big those files have to be, but it can be used on other > things as well... > > I can see a usecase for having a pg_size_pretty(numeric) as an option. > Not necessarily a very big one, but a >0 one. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On fre, 2012-03-09 at 18:13 +0100, Magnus Hagander wrote: > > and confusing XLogFileSize with XLogSegSize. Not the best choice of > > names. > > Yeah, the use of XLogFile to mean something other than, well a file in > the xlog, is greatly annoying.. I guess we could change it, but it > goes pretty deep in the system so it's not a small change... > The whole thing was built around the lack of 64 bit integers. If we bit the bullet and changed the whole thing to be just a single 64-bit counter, we could probably delete thousands of lines of code.
Peter Eisentraut <peter_e@gmx.net> writes: >> Yeah, the use of XLogFile to mean something other than, well a file in >> the xlog, is greatly annoying.. I guess we could change it, but it >> goes pretty deep in the system so it's not a small change... > The whole thing was built around the lack of 64 bit integers. If we bit > the bullet and changed the whole thing to be just a single 64-bit > counter, we could probably delete thousands of lines of code. Hm. I think "thousands" is an overestimate, but yeah the logic could be greatly simplified. However, I'm not sure we could avoid breaking the existing naming convention for WAL files. How much do we care about that? regards, tom lane
On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >>> Yeah, the use of XLogFile to mean something other than, well a file in >>> the xlog, is greatly annoying.. I guess we could change it, but it >>> goes pretty deep in the system so it's not a small change... > >> The whole thing was built around the lack of 64 bit integers. If we bit >> the bullet and changed the whole thing to be just a single 64-bit >> counter, we could probably delete thousands of lines of code. > > Hm. I think "thousands" is an overestimate, but yeah the logic could be > greatly simplified. However, I'm not sure we could avoid breaking the > existing naming convention for WAL files. How much do we care about > that? Probably not very much, since WAL files aren't portable across major versions anyway. But I don't see why you couldn't keep the naming convention - there's nothing to prevent you from converting a 64-bit integer back into two 32-bit integers if and where needed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> The whole thing was built around the lack of 64 bit integers. If >> we bit the bullet and changed the whole thing to be just a single >> 64-bit counter, we could probably delete thousands of lines of >> code. > > Hm. I think "thousands" is an overestimate, but yeah the logic > could be greatly simplified. However, I'm not sure we could avoid > breaking the existing naming convention for WAL files. How much > do we care about that? We have a few scripts in our backup area that are based around the current WAL file naming convention, so there would be some impact; but I have to believe it would be pretty minor. Most of the pain would be related to the need to support both naming conventions for some transition period. If it simplifies the WAL-related logic, it seems well worth it to me. We just have to know it's coming and be clear on what the new naming rules are. -Kevin
On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Peter Eisentraut <peter_e@gmx.net> writes: >>>> Yeah, the use of XLogFile to mean something other than, well a file in >>>> the xlog, is greatly annoying.. I guess we could change it, but it >>>> goes pretty deep in the system so it's not a small change... >> >>> The whole thing was built around the lack of 64 bit integers. If we bit >>> the bullet and changed the whole thing to be just a single 64-bit >>> counter, we could probably delete thousands of lines of code. >> >> Hm. I think "thousands" is an overestimate, but yeah the logic could be >> greatly simplified. However, I'm not sure we could avoid breaking the >> existing naming convention for WAL files. How much do we care about >> that? > > Probably not very much, since WAL files aren't portable across major > versions anyway. But I don't see why you couldn't keep the naming > convention - there's nothing to prevent you from converting a 64-bit > integer back into two 32-bit integers if and where needed. On further reflection, this seems likely to break quite a few third-party tools. Maybe it'd be worth it anyway, but it definitely seems like it would be worth going to at least some minor trouble to avoid it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Hm. �I think "thousands" is an overestimate, but yeah the logic could be >>> greatly simplified. �However, I'm not sure we could avoid breaking the >>> existing naming convention for WAL files. �How much do we care about >>> that? >> Probably not very much, since WAL files aren't portable across major >> versions anyway. �But I don't see why you couldn't keep the naming >> convention - there's nothing to prevent you from converting a 64-bit >> integer back into two 32-bit integers if and where needed. > On further reflection, this seems likely to break quite a few > third-party tools. Maybe it'd be worth it anyway, but it definitely > seems like it would be worth going to at least some minor trouble to > avoid it. The main actual simplification would be in getting rid of the "hole" at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h: /** We break each logical log file (xlogid value) into segment files of the* size indicated by XLOG_SEG_SIZE. One possiblesegment at the end of each* log file is wasted, to ensure that we don't have problems representing* last-byte-position-plus-1.*/ #define XLogSegSize ((uint32) XLOG_SEG_SIZE) #define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize) #define XLogFileSize (XLogSegsPerFile * XLogSegSize) If we can't get rid of that and have a continuous 64-bit WAL address space then it's unlikely we can actually simplify any logic. Now, doing that doesn't break the naming convention exactly; what it changes is that there will be WAL files numbered xxxFFFF (for some number of trailing-1-bits I'm too lazy to work out at the moment) where before there were not. So the question really is how much external code there is that is aware of that specific noncontiguous numbering behavior and would be broken if things stopped being that way. regards, tom lane
On Fri, Mar 9, 2012 at 3:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Hm. I think "thousands" is an overestimate, but yeah the logic could be >>>> greatly simplified. However, I'm not sure we could avoid breaking the >>>> existing naming convention for WAL files. How much do we care about >>>> that? > >>> Probably not very much, since WAL files aren't portable across major >>> versions anyway. But I don't see why you couldn't keep the naming >>> convention - there's nothing to prevent you from converting a 64-bit >>> integer back into two 32-bit integers if and where needed. > >> On further reflection, this seems likely to break quite a few >> third-party tools. Maybe it'd be worth it anyway, but it definitely >> seems like it would be worth going to at least some minor trouble to >> avoid it. > > The main actual simplification would be in getting rid of the "hole" > at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h: > > /* > * We break each logical log file (xlogid value) into segment files of the > * size indicated by XLOG_SEG_SIZE. One possible segment at the end of each > * log file is wasted, to ensure that we don't have problems representing > * last-byte-position-plus-1. > */ > #define XLogSegSize ((uint32) XLOG_SEG_SIZE) > #define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize) > #define XLogFileSize (XLogSegsPerFile * XLogSegSize) > > If we can't get rid of that and have a continuous 64-bit WAL address > space then it's unlikely we can actually simplify any logic. > > Now, doing that doesn't break the naming convention exactly; what it > changes is that there will be WAL files numbered xxxFFFF (for some > number of trailing-1-bits I'm too lazy to work out at the moment) where > before there were not. So the question really is how much external code > there is that is aware of that specific noncontiguous numbering behavior > and would be broken if things stopped being that way. I would expect that most things would NOT know about that particular foible, and just be matching pathnames on an RE, which should be fine. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 09, 2012 at 03:04:23PM -0500, Tom Lane wrote: > The main actual simplification would be in getting rid of the "hole" > at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h: > > /* > * We break each logical log file (xlogid value) into segment files of the > * size indicated by XLOG_SEG_SIZE. One possible segment at the end of each > * log file is wasted, to ensure that we don't have problems representing > * last-byte-position-plus-1. > */ > #define XLogSegSize ((uint32) XLOG_SEG_SIZE) > #define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize) > #define XLogFileSize (XLogSegsPerFile * XLogSegSize) > > If we can't get rid of that and have a continuous 64-bit WAL address > space then it's unlikely we can actually simplify any logic. > > Now, doing that doesn't break the naming convention exactly; what it > changes is that there will be WAL files numbered xxxFFFF (for some > number of trailing-1-bits I'm too lazy to work out at the moment) where > before there were not. So the question really is how much external code > there is that is aware of that specific noncontiguous numbering behavior > and would be broken if things stopped being that way. Our current WAL naming is hopelessly arcane, and we would certainly be benfitting users to simplify it. Is this a TODO? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Sat, Mar 10, 2012 at 3:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Mar 9, 2012 at 12:38 PM, Magnus Hagander <magnus@hagander.net> wrote: >> On Fri, Mar 9, 2012 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Magnus Hagander <magnus@hagander.net> writes: >>>> On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>>> Why would it be useful to use pg_size_pretty on xlog locations? >>>>> -1 because of the large expense of bigint->numeric->whatever conversion >>>>> that would be added to existing uses. >>> >>>> Given the expense, perhaps we need to different (overloaded) functions instead? Agreed. Attached patch introduces the overloaded funtion pg_size_pretty(numeric). >>> That would be a workable solution, but I continue to not believe that >>> this is useful enough to be worth the trouble. >> >> There's certainly some use to being able to prettify it. Wouldn't a >> pg_size_pretty(numeric) also be useful if you want to pg_size_() a >> sum() of something? Used on files it doesn't make too much sense, >> given how big those files have to be, but it can be used on other >> things as well... >> >> I can see a usecase for having a pg_size_pretty(numeric) as an option. >> Not necessarily a very big one, but a >0 one. > > +1. +1, too. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
On Sat, Mar 10, 2012 at 5:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Hm. I think "thousands" is an overestimate, but yeah the logic could be >>>> greatly simplified. However, I'm not sure we could avoid breaking the >>>> existing naming convention for WAL files. How much do we care about >>>> that? > >>> Probably not very much, since WAL files aren't portable across major >>> versions anyway. But I don't see why you couldn't keep the naming >>> convention - there's nothing to prevent you from converting a 64-bit >>> integer back into two 32-bit integers if and where needed. > >> On further reflection, this seems likely to break quite a few >> third-party tools. Maybe it'd be worth it anyway, but it definitely >> seems like it would be worth going to at least some minor trouble to >> avoid it. > > The main actual simplification would be in getting rid of the "hole" > at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h: > > /* > * We break each logical log file (xlogid value) into segment files of the > * size indicated by XLOG_SEG_SIZE. One possible segment at the end of each > * log file is wasted, to ensure that we don't have problems representing > * last-byte-position-plus-1. > */ > #define XLogSegSize ((uint32) XLOG_SEG_SIZE) > #define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize) > #define XLogFileSize (XLogSegsPerFile * XLogSegSize) > > If we can't get rid of that and have a continuous 64-bit WAL address > space then it's unlikely we can actually simplify any logic. > > Now, doing that doesn't break the naming convention exactly; what it > changes is that there will be WAL files numbered xxxFFFF (for some > number of trailing-1-bits I'm too lazy to work out at the moment) where > before there were not. So the question really is how much external code > there is that is aware of that specific noncontiguous numbering behavior > and would be broken if things stopped being that way. A page header contains WAL location, so getting rid of "hole" seems to break pg_upgrade. No? Unless pg_upgrade converts noncontinuous location to continuous one, we still need to handle noncontinuous one after upgrade. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao <masao.fujii@gmail.com> writes: > On Sat, Mar 10, 2012 at 5:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The main actual simplification would be in getting rid of the "hole" >> at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h: >> If we can't get rid of that and have a continuous 64-bit WAL address >> space then it's unlikely we can actually simplify any logic. >> ... >> Now, doing that doesn't break the naming convention exactly; what it >> changes is that there will be WAL files numbered xxxFFFF (for some >> number of trailing-1-bits I'm too lazy to work out at the moment) where >> before there were not. So the question really is how much external code >> there is that is aware of that specific noncontiguous numbering behavior >> and would be broken if things stopped being that way. > A page header contains WAL location, so getting rid of "hole" seems to > break pg_upgrade. No? No, why would it do that? The meaning and ordering of WAL addresses is the same as before. The only difference is that after the upgrade, the system will stop skipping over 16MB of potentially usable WAL addresses at the end of each subsequently-used 4GB of space. The holes before the switchover point are still holes, but that doesn't matter. regards, tom lane
On Tue, Mar 13, 2012 at 12:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> On Sat, Mar 10, 2012 at 5:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> The main actual simplification would be in getting rid of the "hole" >>> at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h: >>> If we can't get rid of that and have a continuous 64-bit WAL address >>> space then it's unlikely we can actually simplify any logic. >>> ... >>> Now, doing that doesn't break the naming convention exactly; what it >>> changes is that there will be WAL files numbered xxxFFFF (for some >>> number of trailing-1-bits I'm too lazy to work out at the moment) where >>> before there were not. So the question really is how much external code >>> there is that is aware of that specific noncontiguous numbering behavior >>> and would be broken if things stopped being that way. > >> A page header contains WAL location, so getting rid of "hole" seems to >> break pg_upgrade. No? > > No, why would it do that? The meaning and ordering of WAL addresses is > the same as before. The only difference is that after the upgrade, the > system will stop skipping over 16MB of potentially usable WAL addresses > at the end of each subsequently-used 4GB of space. The holes before > the switchover point are still holes, but that doesn't matter. Oh, I see. You're right. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Mar 12, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> I can see a usecase for having a pg_size_pretty(numeric) as an option. >>> Not necessarily a very big one, but a >0 one. >> >> +1. > > +1, too. I did some beautification of this patch. I think the attached version is cleaner and easier to read. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
On Sat, Apr 14, 2012 at 5:30 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Mar 12, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>> I can see a usecase for having a pg_size_pretty(numeric) as an option. >>>> Not necessarily a very big one, but a >0 one. >>> >>> +1. >> >> +1, too. > > I did some beautification of this patch. I think the attached version > is cleaner and easier to read. Thoughts? Looks good to me. Thanks for polishing the patch! Regards, -- Fujii Masao
On Sat, Apr 14, 2012 at 7:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Sat, Apr 14, 2012 at 5:30 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Mar 12, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>>> I can see a usecase for having a pg_size_pretty(numeric) as an option. >>>>> Not necessarily a very big one, but a >0 one. >>>> >>>> +1. >>> >>> +1, too. >> >> I did some beautification of this patch. I think the attached version >> is cleaner and easier to read. Thoughts? > > Looks good to me. Thanks for polishing the patch! You're welcome. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company