Thread: BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient
BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient
From
"John Rylander"
Date:
The following bug has been logged online: Bug reference: 2477 Logged by: John Rylander Email address: rylander@prolexia.com PostgreSQL version: 8.1.4 Operating system: Windows XP :( Description: Aggregate Integer divisors incorrectly yield integer-type quotient Details: When for some reason I did a sum/count instead of an avg, the result was always an integer (either 0 or 1): Sum( Case when Sign(Delta)=Sign(AvgDelta) then 1 else 0 ) / Count(*) as HitRate; If I use "/ Count(*)::float4", or use Avg instead of Sum/Count, it works fine. Yet another maddening PostgreSQL bug/idiosyncrasy. Time for this boy to go to Oracle (not because of this, but because of the function->transaction problem, such that a function that has nested loops that create myriad temporary analytical tables necessarily runs out of disk space; if it weren't for the mandatory and completely counterproductive implicit transaction, it'd take very little space and a lot less time). :(
Re: BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient
From
Tom Lane
Date:
"John Rylander" <rylander@prolexia.com> writes: > When for some reason I did a sum/count instead of an avg, the result was > always an integer (either 0 or 1): What's your point? Postgres has always defined division of integers as yielding an integer result. AFAICS this choice is fully sanctioned by the SQL standard: 1) If the data type of both operands of a dyadic arithmetic opera- tor is exact numeric, then the data type of the result is exact numeric, with precision and scale determined as follows: ... d) The precision and scale of the result of division is implementation-defined. (SQL92 6.12 <numeric value expression> syntax rule 1) > Yet another maddening PostgreSQL bug/idiosyncrasy. Time for this boy to go > to Oracle Sayonara. Last I checked, they had a few idiosyncrasies too. regards, tom lane
Re: BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient
From
"Jim C. Nasby"
Date:
On Sun, Jun 11, 2006 at 09:08:00PM +0000, John Rylander wrote: > > The following bug has been logged online: > > Bug reference: 2477 > Logged by: John Rylander > Email address: rylander@prolexia.com > PostgreSQL version: 8.1.4 > Operating system: Windows XP :( > Description: Aggregate Integer divisors incorrectly yield > integer-type quotient > Details: > > When for some reason I did a sum/count instead of an avg, the result was > always an integer (either 0 or 1): > > Sum( > Case when Sign(Delta)=Sign(AvgDelta) then > 1 > else > 0 > ) / Count(*) as HitRate; > > If I use "/ Count(*)::float4", or use Avg instead of Sum/Count, it works > fine. Do you have a test case for this? > Yet another maddening PostgreSQL bug/idiosyncrasy. Time for this boy to go > to Oracle (not because of this, but because of the function->transaction > problem, such that a function that has nested loops that create myriad > temporary analytical tables necessarily runs out of disk space; if it > weren't for the mandatory and completely counterproductive implicit > transaction, it'd take very little space and a lot less time). :( Huh? What implicit transaction? Do you have an example that illustrates this problem? -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Re: BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient
From
"John E. Rylander"
Date:
Jim, First, I got a minor rebuke ;) from Tom Lane on this matter, he rightly pointing out to me that this integer division "bug" is in fact a part of the SQL standard. For the simplest illustration of it, try this: Select 2/3; It comes back with zero! To me, this is unbelievable; but it's not a bug, and it's not PostgreSQL's fault, as it were. As Tom put it: >What's your point? Postgres has always defined division of integers as >yielding an integer result. AFAICS this choice is fully sanctioned by >the SQL standard: > > 1) If the data type of both operands of a dyadic arithmetic opera- > tor is exact numeric, then the data type of the result is exact > numeric, with precision and scale determined as follows: > ... > d) The precision and scale of the result of division is > implementation-defined. I suppose point d) leaves PostgreSQL developers some freedom here, but it also, particularly when combined with 1), gives an apparent okay to what is done now. (After all, the result is "exact", even if it's exactly wrong! :) ) It's interesting that this isn't wholly consistent: the AVG aggregate, e.g., doesn't follow this rule in the same way, even though it's strictly composed of such dyadic operations-and it's a good thing, too. I reported this as a bug because I never considered the possibility that this could be by design. Truncating or rounding to an integer manually, or when assigning a floating point value to an integer variable or field, sure; but doing it that way even when assigning the result to a floating point field? I have no idea why this is considered desirable, and have to wonder if that's the best way of interpreting the standard, but I'm an SQL and (hence) PostgreSQL newbie, so.... Second, and this is the real showstopper for me, it's my understanding that PostgreSQL (unlike, I've read, most other SQL RDBMS) implicitly and necessarily wraps every user-defined function call in a transactional wrapper. Typically, this seems like a sound approach-no worries then when a function fails. But when it's mandatory, one runs into problems with highly query-intensive and analytical functions. I have a PL/PGSQL analytical function that iteratively drops a table, queries to create a new table of the same name based on iterated parameters, and then runs numerous queries on the resulting table, storing a few summary records to a log table. The nested loops in the function intelligently iterate tens or hundreds of thousands of times on tables containing tens or low-hundreds of thousands of records. The problem is, I think, since this function is involuntarily wrapped in a transaction, PostgreSQL wastes a TON of time and space keeping track of every single interim table, and eventually (pretty soon, in fact) it runs out of disk space. (I have only 20 GB free, but the same thing might happen if I had 200 GB free.) This involuntary transaction wrapping has another serious downside: it makes debugging functions harder, since one can't do any table-based "bomb damage assessment" when a function blows up-the target site is all cleaned up (i.e., rolled back) before one can take a look. (I've read that Oracle does NOT do this; OTOH, since Oracle does (I believe) wrap each SQL statement in a transactional wrapper, then it seems a bit ambiguous as to how that applies to an SQL statement that calls a stored procedure. I sure HOPE that it doesn't wrap the call in a transaction, however appealing that might be at first glance!) If I'm misunderstanding things, I'd greatly appreciate correction on these matters! As it is now, I may need painfully to switch to Oracle :( just to get my analytical functions to work properly on anything but very small scale tables-otherwise I just run of out disk space a small part of the way through. Thanks for any guidance you can offer on this-I'd love to stick with PostgreSQL! Best regards, John Rylander -----Original Message----- From: Jim C. Nasby [mailto:jnasby@pervasive.com] Sent: Monday, June 12, 2006 10:48 AM To: John Rylander Cc: pgsql-bugs@postgresql.org Subject: Re: [BUGS] BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient On Sun, Jun 11, 2006 at 09:08:00PM +0000, John Rylander wrote: > > The following bug has been logged online: > > Bug reference: 2477 > Logged by: John Rylander > Email address: rylander@prolexia.com > PostgreSQL version: 8.1.4 > Operating system: Windows XP :( > Description: Aggregate Integer divisors incorrectly yield > integer-type quotient > Details: > > When for some reason I did a sum/count instead of an avg, the result was > always an integer (either 0 or 1): > > Sum( > Case when Sign(Delta)=Sign(AvgDelta) then > 1 > else > 0 > ) / Count(*) as HitRate; > > If I use "/ Count(*)::float4", or use Avg instead of Sum/Count, it works > fine. Do you have a test case for this? > Yet another maddening PostgreSQL bug/idiosyncrasy. Time for this boy to go > to Oracle (not because of this, but because of the function->transaction > problem, such that a function that has nested loops that create myriad > temporary analytical tables necessarily runs out of disk space; if it > weren't for the mandatory and completely counterproductive implicit > transaction, it'd take very little space and a lot less time). :( Huh? What implicit transaction? Do you have an example that illustrates this problem? -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Re: BUG #2477: Aggregate Integer divisors incorrectly yield integer-type quotient
From
Tom Lane
Date:
"John E. Rylander" <rylander@prolexia.com> writes: > ... I have a PL/PGSQL analytical > function that iteratively drops a table, queries to create a new table of > the same name based on iterated parameters, and then runs numerous queries > on the resulting table, storing a few summary records to a log table. The > nested loops in the function intelligently iterate tens or hundreds of > thousands of times on tables containing tens or low-hundreds of thousands of > records. > The problem is, I think, since this function is involuntarily > wrapped in a transaction, PostgreSQL wastes a TON of time and space keeping > track of every single interim table, and eventually (pretty soon, in fact) > it runs out of disk space. It might be possible to teach the system that it could immediately unlink the table file when dropping a table that was created in the current transaction. See the NOTE in smgrscheduleunlink() for a hint about one way to do it. I think you'd have to worry about matching subtransaction nest levels, but if they're the same then allowing inverse unlink requests to "blow each other up" would work. regards, tom lane