Home > mailing lists

Re: Reducing the overhead of NUMERIC data - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Reducing the overhead of NUMERIC data
Date	November 1, 2005 19:01:36
Msg-id	1130885963.8300.1713.camel@localhost.localdomain Whole thread Raw
In response to	Re: Reducing the overhead of NUMERIC data (Martijn van Oosterhout <kleptog@svana.org>)
Responses	Re: Reducing the overhead of NUMERIC data
List	pgsql-hackers

Tree view

On Tue, 2005-11-01 at 23:16 +0100, Martijn van Oosterhout wrote:

lots of useful things, thank you.

> > So, assuming I have this all correct, means we could reduce the on disk
> > storage for NUMERIC datatypes to the following struct. This gives an
> > overhead of just 2.5 bytes, plus the loss of the optimization of
> > trailing zeroes, which I assess as having almost no value anyway in
> > 99.9999% of data values (literally...).
> 
> Actually, I have a table with a column declared as numeric(12,4)
> because there has to be 4 decimal places. As it turns out, the decimal
> places are mostly zero so the optimisation works for me.

Of course it fits some data. The point is whether it is useful for most
people's data.

My contention is that *most* (but definitely nowhere near all) NUMERIC
data is either financial or measured data. That usually means it has
digits that follow Benfold's Law - which for this discussion is a
variant on a uniform random distribution.

Optimizing for trailing zeroes just isn't worth the very minimal
benefits, in most cases. It doesn't really matter that it saves on
storage and processing time in those cases - Amdahl's Law says we can
ignore that saving because the optimized case is not prevalent enough
for us to care.

Anybody like to work out a piece of SQL to perform data profiling and
derive the distribution of values with trailing zeroes? I'd be happy to
be proved wrong with an analysis of real data tables.

Best Regards, Simon Riggs

pgsql-hackers by date:

From: Tom Lane
Date: 01 November 2005, 18:55:41
Subject: Re: Reducing the overhead of NUMERIC data

From: Chris Browne
Date: 01 November 2005, 19:04:11
Subject: Re: slru.c race condition

Re: Reducing the overhead of NUMERIC data - Mailing list pgsql-hackers

Previous

Next