hello everybody,
i have just come across some issue which has been bugging me for a while.
consider:
SELECT * FROM foo ORDER BY bar;
if we have an index on bar, we can nicely optimize away the sort step by consulting the index - a btree will return
sortedoutput.
under normal circumstances it will be seq->sort but doing some config settings we can turn this into an index scan
nicelyto avoid to the sort (disk space is my issue here).
this is not so easy anymore:
create table foo ( x date );create table foo_2010 () INHERITS (foo)create table foo_2009 () INHERITS (foo)create table
foo_2008() INHERITS (foo)
now we add constraints to make sure that data is only in 2008, 2009 and 2010.
we assume that everything is indexed:
SELECT * FROM foo ORDER BY bar will now demand an ugly sort for this data.
this is not an option if you need more than a handful of rows ...
if constraints are non overlapping and if they are based on a "sortable" data type, we might be able to scan one index
afterthe other and get a sorted list.
why is this an issue? imagine a case where you want to do billing, eg. some phone calls. the job now is: the last 10
callsof a customer are free and you want to sum up those which are not free.
to do that you basically need a sorted list per customer. if you have data here which is partitioned over time you are
screwedup because you want to return a sorted list taken from X partitions to some higher level operation (windowing or
whatever).
resorting vast amounts of data is a killer here. in the particular case i am talking about my problem is roughly 2 TB
scaledout to some PL/proxy farm.
does anybody see a solution to this problem?
what are the main showstoppers to make something like this work?
many thanks,
hans
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de