Thread: Numeric performances
Hi all. I'd like to know whether there is any "real world" evaluation (aka test) on performances of the NUMERIC data type when compared to FLOAT8 and FLOAT4. The documentation simply says that the former is "much slower" than the latter ones. I'd also be interested into data storage evaluations. -- Vincenzo Romano -- Maybe Computer will never become as intelligent as Humans. For sure they won't ever become so stupid. [VR-1988]
Vincenzo Romano escribió: > Hi all. > I'd like to know whether there is any "real world" evaluation (aka test) on > performances of the NUMERIC data type when compared to FLOAT8 and FLOAT4. > The documentation simply says that the former is "much slower" than the latter > ones. It is. But why do you care? You either have the correctness that NUMERIC gives, or you don't. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thursday 31 May 2007 19:42:20 Alvaro Herrera wrote: > It is. But why do you care? You either have the correctness that > NUMERIC gives, or you don't. Because FLOAT8 can be precise enough for some application and not enough for other ones. -- Vincenzo Romano -- Maybe Computer will never become as intelligent as Humans. For sure they won't ever become so stupid. [VR-1988]
On Thu, May 31, 2007 at 09:25:27PM +0200, Vincenzo Romano wrote: > On Thursday 31 May 2007 19:42:20 Alvaro Herrera wrote: > > It is. But why do you care? You either have the correctness that > > NUMERIC gives, or you don't. > > Because FLOAT8 can be precise enough for some application and not > enough for other ones. If you can use float, use it. There's hardware support for that, whereas there's none for numeric... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
At 01:42 AM 6/1/2007, Alvaro Herrera wrote: >Vincenzo Romano escribió: > > Hi all. > > I'd like to know whether there is any "real world" evaluation (aka test) on > > performances of the NUMERIC data type when compared to FLOAT8 and FLOAT4. > > The documentation simply says that the former > is "much slower" than the latter > > ones. > >It is. But why do you care? You either have the correctness that >NUMERIC gives, or you don't. I suspect it's still useful to know what order of magnitude slower it is. After all if it is 1000x slower (not saying it is), some people may decide it's not worth it or roll their own. Any hints/gotchas for/when doing such performance tests? Regards, Link.
>> It is. But why do you care? You either have the correctness that >> NUMERIC gives, or you don't. > > I suspect it's still useful to know what order of magnitude slower it > is. After all if it is 1000x slower (not saying it is), some people may > decide it's not worth it or roll their own. > > Any hints/gotchas for/when doing such performance tests? forum_bench=> CREATE TEMPORARY TABLE test AS SELECT a::FLOAT AS f, a::NUMERIC AS n, a::INTEGER AS i, a::BIGINT AS b FROM generate_series( 1,100000 ) AS a; SELECT Temps : 1169,125 ms forum_bench=> SELECT sum(i) FROM test; Temps : 46,589 ms forum_bench=> SELECT sum(b) FROM test; Temps : 157,018 ms forum_bench=> SELECT sum(f) FROM test; Temps : 63,865 ms forum_bench=> SELECT sum(n) FROM test; Temps : 124,816 ms SELECT * FROM test ORDER BY i LIMIT 1; SELECT * FROM test ORDER BY b LIMIT 1; SELECT * FROM test ORDER BY f LIMIT 1; SELECT * FROM test ORDER BY n LIMIT 1; Temps : 68,996 ms Temps : 68,917 ms Temps : 62,321 ms Temps : 71,880 ms BEGIN; CREATE INDEX test_i ON test(i); CREATE INDEX test_b ON test(b); CREATE INDEX test_f ON test(f); CREATE INDEX test_n ON test(n); ROLLBACK; CREATE INDEX Temps : 102,901 ms CREATE INDEX Temps : 123,406 ms CREATE INDEX Temps : 105,255 ms CREATE INDEX Temps : 134,468 ms
Hmmm ... It sounds quite strange to me that numeric is faster than bigint. Even if bigint didn't get hw support in the CPU it should have been faster that numeric as it should be mapped in 2 32-bits integers. Numeric algorithms should be linear (according to the number of digits) in complexity when compared to float, float8, integer and bigint (that should be constant in my mind). Nonetheless the suggested "fast test" makes some sense in my mind. On Monday 04 June 2007 12:06:47 PFC wrote: > >> It is. But why do you care? You either have the correctness that > >> NUMERIC gives, or you don't. > > > > I suspect it's still useful to know what order of magnitude slower it > > is. After all if it is 1000x slower (not saying it is), some people may > > decide it's not worth it or roll their own. > > > > Any hints/gotchas for/when doing such performance tests? > > forum_bench=> CREATE TEMPORARY TABLE test AS SELECT a::FLOAT AS f, > a::NUMERIC AS n, a::INTEGER AS i, a::BIGINT AS b FROM > generate_series( 1,100000 ) AS a; > SELECT > Temps : 1169,125 ms > > forum_bench=> SELECT sum(i) FROM test; > Temps : 46,589 ms > > forum_bench=> SELECT sum(b) FROM test; > Temps : 157,018 ms > > forum_bench=> SELECT sum(f) FROM test; > Temps : 63,865 ms > > forum_bench=> SELECT sum(n) FROM test; > Temps : 124,816 ms > > SELECT * FROM test ORDER BY i LIMIT 1; SELECT * FROM test ORDER BY b LIMIT > 1; SELECT * FROM test ORDER BY f LIMIT 1; SELECT * FROM test ORDER BY n > LIMIT 1; > Temps : 68,996 ms > Temps : 68,917 ms > Temps : 62,321 ms > Temps : 71,880 ms > > BEGIN; CREATE INDEX test_i ON test(i); CREATE INDEX test_b ON test(b); > CREATE INDEX test_f ON test(f); CREATE INDEX test_n ON test(n); ROLLBACK; > CREATE INDEX > Temps : 102,901 ms > CREATE INDEX > Temps : 123,406 ms > CREATE INDEX > Temps : 105,255 ms > CREATE INDEX > Temps : 134,468 ms -- Vincenzo Romano -- Maybe Computer will never become as intelligent as Humans. For sure they won't ever become so stupid. [VR-1988]
This is a 32 bit CPU by the way. Consider this : - There are 100K rows - The CPU executes about 3 billion instructions per second if everything is optimum - "SELECT sum(n) FROM test", takes, say 60 ms This gives about 1800 CPU ops per row. A Float addition versus an Int addition is a drop in the sea. I believe the marked difference between Floats/Ints (faster) and Numeric/Bigint (slower) comes from being passed by value or by pointers. A single access which misses the CPU cache and has to go fetch data from the real RAM spends a lot more cycles than the simple loops in a NUMERIC addition which will hit L1 cache. Nowadays cache access patterns matter more than how many actual CPU instructions are executed... >> forum_bench=> SELECT sum(i) FROM test; >> Temps : 46,589 ms >> forum_bench=> SELECT sum(f) FROM test; >> Temps : 63,865 ms >> forum_bench=> SELECT sum(b) FROM test; >> Temps : 157,018 ms >> forum_bench=> SELECT sum(n) FROM test; >> Temps : 124,816 ms
I'm getting more confused. If the algorithm used to do the sum is a drop in the sea, then the resources needed to pass a pointer on the stack are a molecule in the drop! :-) Nonetheless I think that your directions are right: doing actual queries instead of inspecting the algorithms themselves should yeld numbers that are (by definition) coherent with real world usage! Another point is related to storage. I think that as far as the storage for a numeric is within few bytes, the difference should be related only to the algorithm. But with larger size, you have no option with floats! So, finally, the question should have been: When used in the same ranges as FLOAT8 or FLOAT, what'd be the performance impact of NUMERIC? Sorry for having been unclear. And thanks for the hints. On Monday 04 June 2007 13:17:49 PFC wrote: > This is a 32 bit CPU by the way. > > Consider this : > > - There are 100K rows > - The CPU executes about 3 billion instructions per second if everything > is optimum > - "SELECT sum(n) FROM test", takes, say 60 ms > > This gives about 1800 CPU ops per row. > A Float addition versus an Int addition is a drop in the sea. > > I believe the marked difference between Floats/Ints (faster) and > Numeric/Bigint (slower) comes from being passed by value or by pointers. > > A single access which misses the CPU cache and has to go fetch data from > the real RAM spends a lot more cycles than the simple loops in a NUMERIC > addition which will hit L1 cache. > Nowadays cache access patterns matter more than how many actual CPU > instructions are executed... > > >> forum_bench=> SELECT sum(i) FROM test; > >> Temps : 46,589 ms > >> forum_bench=> SELECT sum(f) FROM test; > >> Temps : 63,865 ms > >> > >> forum_bench=> SELECT sum(b) FROM test; > >> Temps : 157,018 ms > >> forum_bench=> SELECT sum(n) FROM test; > >> Temps : 124,816 ms -- Vincenzo Romano -- Maybe Computer will never become as intelligent as Humans. For sure they won't ever become so stupid. [VR-1988]
Vincenzo Romano <vincenzo.romano@gmail.com> writes: > It sounds quite strange to me that numeric is faster than bigint. This test is 100% faulty, because it fails to consider the fact that the accumulator used by sum() isn't necessarily the same type as the input data. In fact we sum ints in a bigint and bigints in a numeric to avoid overflow. If you try it with max() you'd likely get less-surprising answers. regards, tom lane
> If you try it with max() you'd likely get less-surprising answers. So it was in fact the type conversions that got timed. Damn. I got outsmarted XDD Rewind : CREATE TEMPORARY TABLE test AS SELECT a::FLOAT AS f, (a::NUMERIC)*100000000000000 AS n, a::INTEGER AS i, a::BIGINT AS b FROM generate_series( 1,100000 ) AS a; Max and Sort will use comparisons : SELECT max(i) FROM test; SELECT max(b) FROM test; SELECT max(f) FROM test; SELECT max(n) FROM test; Temps : 42,132 ms Temps : 59,499 ms Temps : 58,808 ms Temps : 54,197 ms SELECT * FROM test ORDER BY i LIMIT 1; SELECT * FROM test ORDER BY b LIMIT 1; SELECT * FROM test ORDER BY f LIMIT 1; SELECT * FROM test ORDER BY n LIMIT 1; Temps : 58,723 ms Temps : 60,520 ms Temps : 53,188 ms Temps : 61,779 ms SELECT count(*) FROM test a JOIN test b ON (a.i=b.i); Temps : 275,411 ms SELECT count(*) FROM test a JOIN test b ON (a.b=b.b); Temps : 286,132 ms SELECT count(*) FROM test a JOIN test b ON (a.f=b.f); Temps : 295,956 ms SELECT count(*) FROM test a JOIN test b ON (a.n=b.n); Temps : 321,292 ms SELECT count(*) FROM test a JOIN test b ON (a.i=b.b); Temps : 281,162 ms SELECT count(*) FROM test a JOIN test b ON (a.n=b.i::NUMERIC*100000000000000); Temps : 454,706 ms Now, addition : SELECT count(i+1) FROM test; Temps : 46,973 ms SELECT count(b+1) FROM test; Temps : 60,027 ms SELECT count(f+1) FROM test; Temps : 56,829 ms SELECT count(n+1) FROM test; Temps : 103,316 ms Multiplication : SELECT count(i*1) FROM test; Temps : 46,950 ms SELECT count(b*1) FROM test; Temps : 58,670 ms SELECT count(f*1) FROM test; Temps : 57,058 ms SELECT count(n*1) FROM test; Temps : 108,379 ms SELECT count(i) FROM test; Temps : 38,351 ms SELECT count(i/1234) FROM test; Temps : 48,961 ms SELECT count(b/1234) FROM test; Temps : 62,496 ms SELECT count(n/1234) FROM test; Temps : 186,674 ms Conclusion : numeric seems a bit slower (division being slowest obviously) but nothing that can't be swamped by a few disk seeks...