Re: [PATCHES] Maintaining cluster order on insert - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PATCHES] Maintaining cluster order on insert
Msg-id 9461.1155178183@sss.pgh.pa.us
In response to Re: [PATCHES] Maintaining cluster order on insert  (Gene <genekhart@gmail.com>)
List pgsql-hackers
Gene <genekhart@gmail.com> writes:
> I have a table that inserts lots of rows (million+ per day) int8 as primary
> key, and I cluster by a timestamp which is approximately the timestamp of
> the insert beforehand and is therefore in increasing order and doesn't
> change. Most of the rows are updated about 3 times over time roughly within
> the next 30 minutes.

ISTM you should hardly need to worry about clustering that --- the data
will be in timestamp order pretty naturally.

The main problem you're going to have is the update-3-times bit.  You
could keep updated rows on the same page as the original if you ran the
table at fillfactor 25% (which you'll be able to do in 8.2) ... but
while this might be sane for the leading edge of the table, you hardly
want such low storage density in the stable part.
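To see where the 25% figure comes from: a row that is inserted once and then updated three times leaves four versions behind, so only a quarter of each page can be filled at insert time if all later versions are to land on the same page. A rough sketch of that arithmetic follows; the function name is made up for illustration, and it assumes every row version is the same size and that no dead versions are reclaimed in between.

```python
def fillfactor_for_updates(updates_per_row: int) -> float:
    """Percent of a page to fill at insert time so that every later
    version of each row still fits on the same page.

    Assumes equal-sized row versions and no space reclaimed by vacuum
    between updates (both are simplifications).
    """
    versions_per_row = 1 + updates_per_row  # the original plus one per update
    return 100.0 / versions_per_row


print(fillfactor_for_updates(3))  # 25.0
```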

You could reduce the fillfactor requirement if you could vacuum the
table constantly (every 10 minutes or so) but I assume the table is
large enough to make that unattractive.  (Eventually we should have
a version of vacuum that understands where the dirty stuff is, which
might make this approach tenable ... but not in 8.2.)

Your best bet might be to partition the table into two subtables, one
with "stable" data and one with the fresh data, and transfer rows from
one to the other once they get stable.  Storage density in the "fresh"
part would be poor, but it should be small enough that you don't care.
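The two-subtable idea might look roughly like the sketch below. It uses Python's sqlite3 module purely as a stand-in so the example runs on its own; it is not PostgreSQL code. The table names, column names, and the 30-minute cutoff are assumptions taken from the description above, not anything in the original setup.

```python
import sqlite3

# Assumed: rows stop being updated about 30 minutes after insert.
STABLE_AFTER_SECONDS = 30 * 60


def make_tables(conn):
    """Create the hypothetical 'fresh' and 'stable' subtables."""
    for name in ("fresh", "stable"):
        conn.execute(
            f"CREATE TABLE {name} "
            "(id INTEGER PRIMARY KEY, ts INTEGER NOT NULL, payload TEXT)"
        )


def move_stable_rows(conn, now):
    """Move rows older than the cutoff from 'fresh' to 'stable'.

    Copy and delete run in one transaction, so a row is never in
    both tables or in neither. Returns the number of rows moved.
    """
    cutoff = now - STABLE_AFTER_SECONDS
    with conn:
        # Copy in timestamp order so the stable table stays ordered.
        conn.execute(
            "INSERT INTO stable (id, ts, payload) "
            "SELECT id, ts, payload FROM fresh WHERE ts <= ? ORDER BY ts",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM fresh WHERE ts <= ?", (cutoff,))
    return cur.rowcount


conn = sqlite3.connect(":memory:")
make_tables(conn)
conn.executemany(
    "INSERT INTO fresh (id, ts, payload) VALUES (?, ?, ?)",
    [(1, 0, "a"), (2, 1000, "b"), (3, 3000, "c")],
)
moved = move_stable_rows(conn, now=3600)
print(moved)  # 2
```

Running the transfer periodically keeps the fresh table down to roughly the last half hour of inserts, which is the part where low storage density is tolerable.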

            regards, tom lane
