Re: Reducing the overhead of NUMERIC data - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Reducing the overhead of NUMERIC data
Date
Msg-id 1130885963.8300.1713.camel@localhost.localdomain
Whole thread Raw
In response to Re: Reducing the overhead of NUMERIC data  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Reducing the overhead of NUMERIC data
List pgsql-hackers
On Tue, 2005-11-01 at 23:16 +0100, Martijn van Oosterhout wrote:

lots of useful things, thank you.

> > So, assuming I have this all correct, means we could reduce the on disk
> > storage for NUMERIC datatypes to the following struct. This gives an
> > overhead of just 2.5 bytes, plus the loss of the optimization of
> > trailing zeroes, which I assess as having almost no value anyway in
> > 99.9999% of data values (literally...).
> 
> Actually, I have a table with a column declared as numeric(12,4)
> because there has to be 4 decimal places. As it turns out, the decimal
> places are mostly zero so the optimisation works for me.

Of course it fits some data. The point is whether it is useful for most
people's data.

My contention is that *most* (but definitely nowhere near all) NUMERIC
data is either financial or measured data. That usually means it has
digits that follow Benfold's Law - which for this discussion is a
variant on a uniform random distribution.

Optimizing for trailing zeroes just isn't worth the very minimal
benefits, in most cases. It doesn't really matter that it saves on
storage and processing time in those cases - Amdahl's Law says we can
ignore that saving because the optimized case is not prevalent enough
for us to care.

Anybody like to work out a piece of SQL to perform data profiling and
derive the distribution of values with trailing zeroes? I'd be happy to
be proved wrong with an analysis of real data tables.

Best Regards, Simon Riggs



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Reducing the overhead of NUMERIC data
Next
From: Chris Browne
Date:
Subject: Re: slru.c race condition