Thread: Performance implications of numeric?

Performance implications of numeric?

From
Wells Oliver
Date:
We have a lot of tables which store numeric data. These tables all use the numeric type, where the values are 95% integer values. We used numeric because it eliminated the need for casting during division to yield a floating point value.

I'm curious as to whether this would have performance and/or disk size implications. Would converting these columns to integer (or double precision on the handful that require the precision) and forcing developers to use explicit casting be worth the time?

Thanks for any clarification.

--
Wells Oliver
wellsoliver@gmail.com

Re: Performance implications of numeric?

From
Craig Ringer
Date:
On 08/22/2012 12:27 PM, Wells Oliver wrote:
> We have a lot of tables which store numeric data. These tables all use
> the numeric type, where the values are 95% integer values. We used
> numeric because it eliminated the need for casting during division to
> yield a floating point value.
>
> I'm curious as to whether this would have performance and/or disk size
> implications.

Yes, and yes, though the gap seems to have shrunk a lot since I first
started using Pg.

It's easy to concoct fairly meaningless micro-benchmarks, but you should
really try it with some real queries you run on your real schema. Take a
copy of your data, convert it, and run some tests. Use
`pg_total_relation_size` to compare the numeric and int versions of the
relations after `CLUSTER`ing them to debloat and reindex them.

> Would converting these columns to integer (or double
> precision on the handful that require the precision) and forcing
> developers to use explicit casting be worth the time?

Without knowing your workload and your constraints, that's a "how blue
is the sky" question.

--
Craig Ringer


Re: Performance implications of numeric?

From
Craig Ringer
Date:
On 08/23/2012 12:48 AM, Wells Oliver wrote:
> Hey, thanks for your feedback. Just to clarify: pg_total_relation_size
> returns bytes, correct?

Please reply to the list, not directly to me.

Yes, pg_total_relation_size returns bytes. The documentation
(http://www.postgresql.org/docs/9.1/static/functions-admin.html) doesn't
seem to explicitly say that for pg_total_relation_size though it does
for pg_relation_size and other functions.

Use pg_size_pretty to convert bytes to "human" values for display.

--
Craig Ringer



Re: Performance implications of numeric?

From
Merlin Moncure
Date:
On Tue, Aug 21, 2012 at 11:27 PM, Wells Oliver <wellsoliver@gmail.com> wrote:
> We have a lot of tables which store numeric data. These tables all use the
> numeric type, where the values are 95% integer values. We used numeric
> because it eliminated the need for casting during division to yield a
> floating point value.
>
> I'm curious as to whether this would have performance and/or disk size
> implications. Would converting these columns to integer (or double precision
> on the handful that require the precision) and forcing developers to use
> explicit casting be worth the time?
>
> Thanks for any clarification.

Calculations against numeric are several orders of magnitude slower
than native binary operations.  Fortunately the time the database
spends doing these types of calculations is often a tiny fraction of
overall execution time and I advise giving numeric a whirl unless you
measure a big performance hit.  Let's put it this way: native binary
types are a performance hack that come with all kinds of weird baggage
that percolate up and uglify your code: your example given is a
classic case in point.  Database "integer" types are not in fact
integers but a physically constrained approximation of them.  Floating
point types are even worse.

Another example: I just found out for the first time (after many years
of programming professionally) that -2147483648 / -1 raises a hardware
exception: this is exactly the kind of thing that makes me think that
rote use of hardware integer types is a terribly bad practice.

merlin